How is insertion and deletion more faster in red black tree than AVL tree? - algorithm

I would like to understand the difference bit better, but haven't found a source that can break it down to my level.
I am aware that both trees require at most 2 rotations per insertion. Then how is insertion faster in red-black trees?
And how insertion requires O(log n) rotations in avl tree while O(1) in red-black?

Well, I don't know what your level is, exactly, but to put it simply, red-black trees are less balanced than AVL trees. For red-black trees, the path from the root to the furthest leaf is no more than twice as long as the path from the root to the nearest leaf, while for AVL trees there is never more than one level difference between two neighboring subtrees. This makes insertions and deletions slightly more costly in AVL trees but lookup faster. The asymptotic and worst-case behavior of the two data structures is identical though (the runtime (not number of rotations) is O(log n) for insertions in both cases, the O(1) you mentioned is the so-called amortized runtime).
See this paragraph for a short comparison of the two data structures.

Insertion and deletion is not faster in red-black trees. This is a common ASSUMPTION and the assumption is based on the fact that red-black trees perform slightly fewer rotations on average per insert than AVL (.6 vs .7).
You can check for yourself in Java comparing TreeMap(red-black) to this implementation of TreeMapAVL and you can get exact numbers instead of the common, but incorrect, assumptions. https://github.com/dmcmanam/bbst-showdown

Related

Are there any advantages or specific cases where we should prefer using Binary search tree rather than AVL tree?

Are there any advantages or specific cases where we should prefer using Binary search tree rather than AVL tree.
If you do not care about the time complexity of lookup/insert/remove operations, then BST is good enough. It's easier to implement and requires less space. However, in the worst case, its performance is O(n) - imagine adding only increasing or decreasing elements to your BST.
On the other hand, if you do care about the performance, then you may use an AVL tree because it is a self-balancing BST - its height is guaranteed to be ~ log(n), where n is a number of nodes in the tree. That's why lookup lookup/insert/remove operations are logarithmic. However, an AVL tree requires more space (each node needs to hold its height), and additional logic to re-balance the tree if such property gets violated.

Red black tree over avl tree

AVL and Red black trees are both self-balancing except Red and black color in the nodes. What's the main reason for choosing Red black trees instead of AVL trees? What are the applications of Red black trees?
What's the main reason for choosing Red black trees instead of AVL trees?
Both red-black trees and AVL trees are the most commonly used balanced binary search trees and they support insertion, deletion and look-up in guaranteed O(logN) time. However, there are following points of comparison between the two:
AVL trees are more rigidly balanced and hence provide faster look-ups. Thus for a look-up intensive task use an AVL tree.
For an insert intensive tasks, use a Red-Black tree.
AVL trees store the balance factor at each node. This takes O(N) extra space. However, if we know that the keys that will be inserted in the tree will always be greater than zero, we can use the sign bit of the keys to store the colour information of a red-black tree. Thus, in such cases red-black tree takes no extra space.
What are the application of Red black tree?
Red-black trees are more general purpose. They do relatively well on add, remove, and look-up but AVL trees have faster look-ups at the cost of slower add/remove. Red-black tree is used in the following:
Java: java.util.TreeMap, java.util.TreeSet
C++ STL (in most implementations): map, multimap, multiset
Linux kernel: completely fair scheduler, linux/rbtree.h
Try reading this article
It offers some good insights on differences, similarities, performance, etc.
Here's a quote from the article:
RB-Trees are, as well as AVL trees, self-balancing. Both of them provide O(log n) lookup and insertion performance.
The difference is that RB-Trees guarantee O(1) rotations per insert operation. That is what actually costs performance in real implementations.
Simplified, RB-Trees gain this advantage from conceptually being 2-3 trees without carrying around the overhead of dynamic node structures. Physically RB-Trees are implemented as binary trees, the red/black-flags simulate 2-3 behaviour
As far as my own understanding goes, AVL trees and RB trees are not very far off in terms of performance. An RB tree is simply a variant of a B-tree and balancing is implemented differently than an AVL tree.
Our understanding of the differences in performance has improved over the years and now the main reason to use red-black trees over AVL would be not having access to a good AVL implementation since they are slightly less common perhaps because they are not covered in CLRS.
Both trees are now considered forms of rank-balanced trees but red-black trees are consistently slower by about 20% in real world tests. Or even 30-40% slower when sequential data is inserted.
So people who have studied red-black trees but not AVL trees tend to choose red-black trees. The primary uses for red-black trees are detailed on the Wikipedia entry for them.
Other answers here sum up the pros & cons of RB and AVL trees well, but I found this difference particularly interesting:
AVL trees do not support constant amortized update cost [but red-black trees do]
Source: Mehlhorn & Sanders (2008) (section 7.4)
So, while both RB and AVL trees guarantee O(log(N)) worst-case time for lookup, insert and delete, restoring the AVL/RB property after inserting or deleting a node can be done in O(1) amortized time for red-black trees.
Insertions in AVL trees and in RB trees both require a maximum of 2 rotations. From https://adtinfo.org/ :
The primary advantage of red-black trees is that, in AVL trees, deleting one node from a tree containing n nodes may require log 2 n rotations, but deletion in a red-black tree never requires more than three rotations.
They're both the most commonly used tree types but Red black trees have faster insertions because they have relaxed balancing apart from that they both have O(log N) search time something that's common between them. They are both equally good but AVL is usually more balanced because of it's 1.44 logN complexity over 2 logN complexity

Implementation of priority queue by AVL Tree data structure

Priority queue:
Basic operations: Insertion
Delete (Delete minumum element)
Goal: To provide efficient running time or order of growth for above functionality.
Implementation of Priority queue By:
Linked List: Insertion will take o(n) in case of insertion at end o(1) in case of
insertion at head.
Delet (Finding minumum and Delete this ) will take o(n)
BST:
Insertion/Deltion of minimum = In avg case it will take o(logn) worst case 0(n)
AVL Tree:
Insertion/deletion/searching: o(log n) in all cases.
My confusion goes here:
Why not we have used AVL Tree for implementation of Priority queue, Why we gone
for Binary heap...While as we know that in AVL Tree we can do insertion/ Deletion/searching in o(log n) in worst case.
Complexity isn't everything, there are other considerations for actual performance.
For most purposes, most people don't even use an AVL tree as a balanced tree (Red-Black trees are more common as far as I've seen), let alone as a priority queue.
This is not to say that AVL trees are useless, I quite like them. But they do have a relatively expensive insert. What AVL trees are good for (beating even Red-Black trees) is doing lots and lots of lookups without modification. This is not what you need for a priority queue.
As a separate consideration -- never mind your O(log n) insert for a binary heap, a fibonacci heap has O(1) insert and O(log N) delete-minimum. There are a lot of data structures to choose from with slightly different trade-offs, so you wouldn't expect to see everyone just pick the first thing that satisfies your (quite brief) criteria.
Binary heap is not Binary Search Tree (BST). If severely unbalanced / deteriorated into a list, it will indeed take O(n) time. Heaps are usually always O(log(n)) or better. IIRC Sedgewick claimed O(1) average-time for array-based heaps.
Why not AVL? Because it maintains too much order in a structure. Too much order means, too much effort went into maintaining that order. The less order we can get away with, the better - it will usually translate to faster operations. For example, RBTs are better than AVL trees. RBTs, red-black trees, are almost balanced trees - they save operations while still ensuring O(log(n)) time.
But any tree is totally-ordered structure, so heaps are generally better, because they only ensure that the minimal element is on top. They are only partially ordered.
Because in a binary heap the minimum element is the root.

Using a 2-3-4 tree instead of a splay tree

I'm in a data structures course right now and we learned about 2-3-4 trees and splay trees. I was wondering in what circumstances would you use a 2-3-4 tree instead of a splay tree? They're both self balancing and sorted so I don't see that much of a difference between them.
A 2-3-4 tree only changes the structure on insertions and deletions, while a splay-tree also re-organizes the nodes on searches.
Splay trees will, thanks to the re-organization on lookup, provide faster responses if your typical usage pattern happens to look up a small subset of elements most of the time.
It is possible to implement a 2-3-4 tree such that the smallest element can be looked up in O(1), but generally both offer insertion and deletion at amortized O(log n).

Applications of red-black trees

What are the applications of red-black (RB) trees? Is there any application where only RB Trees can be used and no other data structures?
A red-black tree is a particular implementation of a self-balancing binary search tree, and today it seems to be the most popular choice of implementation.
Binary search trees are used to implement finite maps, where you store a set of keys with associated values. You can also implement sets by only using the keys and not storing any values.
Balancing the tree is needed to guarantee good performance, as otherwise the tree could degenerate into a list, for example if you insert keys which are already sorted.
The advantage of search trees over hash tables is that you can traverse the tree efficiently in sort order.
AVL-trees are another variant of balanced binary search trees. They were popular before red-black trees were known. They are more carefully balanced, with a maximal difference of one between the heights of the left and right subtree (RB trees guarantee at most a factor of two). Their main drawback is that rebalancing takes more effort.
So red-black trees are certainly a good but not the only choice for this application.
Red Black Trees are from a class of self balancing BSTs and as answered by others, any such self balancing tree can be used. I would like to add that Red-black trees are widely used as system symbol tables. For example they are used in implementing the following:
Java: java.util.TreeMap , java.util.TreeSet .
C++ STL: map, multimap, multiset.
Linux kernel: completely fair scheduler, linux/rbtree.h
Unless you have very specific performance requirements, an R-B tree could be replaced by some other self-balancing binary tree, for example an AVL tree. Choosing between the two of them is basically a performance optimization - they offer the same basic operations.
Not that either of them is definitively "faster" than the other, just that they're different enough that specific uses of them will tend to have slightly different performance, all else being equal. So if you draw your requirements carefully enough, or just by chance, you could end up with one of them being "fast enough" for your use, and the other not. R-B offers slightly faster insertion than AVL, at the cost of slightly slower lookup.
There is no such rule like red black can only be used in a particular case
it depends upon the application in cases like when You have to build the tree only once and you have to query it many times then you can go for a AVL tree because in AVL tree searching is quite fast.. But it is strictly balanced so insertion and deletion may take some time
AVl tree may be used for language dictionery where You have to build the data structure just once
and the red black tree is used in the Completely Fair Scheduler used in current Linux kernels now a days..
the constraints applied on the red black tree also enforce the point that that that the path from the root to the furthest leaf is no more than twice as long as the path from the root to the nearest leaf.
BTW you can look for the various seach and insert etc time required for a red black tree down here
Average Worst case
Space O(n) O(n)
Search O(log n) O(log n)
Insert O(log n) O(log n)
Delete O(log n) O(log n)

Resources