Time complexity: comparing B-trees with red-black trees

I am new to data structures. I understand that we use B-trees to minimize disk accesses, but why do we use red-black trees in memory instead of B-trees? Don't both perform at O(log n)? In my opinion a B-tree has a smaller height and requires less space (each node can have t-1 to 2t-1 keys), while a red-black tree must have 2 children for every internal node.

A red-black tree is a self-balancing binary search tree, and a B-tree is a self-balancing multiway search tree. Search trees that do not balance themselves can become inefficient, though you can process them and manually rebalance from time to time; that means a big fat lock!
Yes, both are O(log n) for search, insertion, and deletion. Note that a B-tree is not a binary tree. Only a tree that is allowed to become unbalanced over time loses the O(log n) guarantee.
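To put numbers on the height argument from the question, here is a minimal sketch that evaluates the standard textbook bounds: a B-tree with minimum degree t and n keys has at least 2*t^h - 1 keys at height h, and a red-black tree with n internal nodes has height at most 2*log2(n+1). The function names and the sample values of t are my own, chosen only for illustration.

```python
def btree_max_height(n: int, t: int) -> int:
    """Largest height h (root at height 0) of a B-tree with n >= 1 keys and
    minimum degree t >= 2, using the bound n >= 2 * t**h - 1."""
    h = 0
    while 2 * t ** (h + 1) - 1 <= n:
        h += 1
    return h


def rbtree_max_height(n: int) -> int:
    """Largest integer h with h <= 2 * log2(n + 1), the height bound for a
    red-black tree with n internal nodes."""
    # floor(log2(x)) == x.bit_length() - 1 for a positive integer x.
    return ((n + 1) ** 2).bit_length() - 1


n = 1_000_000
for t in (2, 16, 512):  # illustrative minimum degrees, not from the question
    print(f"B-tree, t={t}: height <= {btree_max_height(n, t)}")
print(f"red-black tree: height <= {rbtree_max_height(n)}")
```

Both bounds are logarithmic in n. A larger t only changes the base of the logarithm, and each B-tree node then holds up to 2t-1 keys that still have to be searched.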

The main cost of red-black tree boils down to a very simple problem of swapping out a single red capacity near the root to the perfect black tree top.
Assuming red node generation is uniform, the total cost sum of uniform subtree volumes is quadratic. Because the black height is basically invariant, a single red node generates global red waterfall, self-amplifying while falling up to the top.
The red capacity is linear to the black-tree volume. The extra red-absorption capacity is relative to the black tree height. Every single red capacity absorption moves up exactly one black hole. The black hole moves up step by step.
The total red capacity flow is described as a simple red heat flow to the black tree top, which is a very simple heatbath.

Related

Complexity analysis exercise on RB-Trees

BLACK_PATH(T,x)
    if x==NIL
        then return TRUE
    if COLOR(x)==BLACK
        then return BLACK_PATH(T,left(x)) || BLACK_PATH(T,right(x))
    return FALSE
The exercise asks to analyse the complexity of this procedure. I believe the recurrence is the following:
T(n)<=2T(2n/3)+O(1)
Using the recursion tree I obtain T(n)=O(n). Is this correct?
The complexity of this method is linear (O(n)) in the worst case with regards to the number of elements in the tree.
Using the master theorem in terms of the total number of nodes is difficult here because it does not take the properties of a red-black tree into account. While it is true in general for heaps that every subtree of a tree with n nodes has at most 2n/3 nodes, red-black trees are instead balanced with respect to black nodes: every path downwards from an arbitrary node to a leaf has the same number of black nodes. The procedure only descends through black nodes, so its work is bounded by that of a perfectly balanced all-black tree, in which every subtree has at most m/2 of the m black nodes.
Most importantly: the total number of nodes is not asymptotically higher than the number of black nodes (every red node has a black parent and a black node has at most two red children, so n <= 3m). By analyzing the complexity purely with regard to the number of black nodes, you therefore implicitly analyze it with regard to the total number of nodes.
So rather than using T(n)<=2T(2n/3)+O(1) you should use T(m)<=2T(m/2)+O(1), where m is the number of black nodes. This gives O(m), and because, as previously discussed, O(m)==O(n), we have O(n).
Another way to think about it: So long as you can understand that this algorithm is O(n) when all the nodes in the tree are black, you should be able to understand that it could only possibly require fewer operations if some of the nodes in the tree are red, since regardless of where the red node is every node in the subtree rooted at that red node will be ignored and not visited by this recursive algorithm. So it can only be O(n) or better, establishing O(n) as your worst case.
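A small experiment can make the bound concrete. The sketch below is a direct Python transcription of BLACK_PATH with a call counter added; the Node class, the counter parameter, and the perfect helper are my own scaffolding and not part of the exercise. Python's `or` short-circuits in the same way `||` is usually assumed to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    color: str  # "B" for black, "R" for red
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def black_path(x, counter):
    """Transcription of BLACK_PATH; counter[0] counts every call, NIL included."""
    counter[0] += 1
    if x is None:
        return True
    if x.color == "B":
        return black_path(x.left, counter) or black_path(x.right, counter)
    return False


def perfect(colors):
    """Perfect binary tree whose i-th level is colored colors[i]."""
    if not colors:
        return None
    return Node(colors[0], perfect(colors[1:]), perfect(colors[1:]))


for levels in ("BBBBB", "BBBRB"):
    calls = [0]
    result = black_path(perfect(levels), calls)
    print(levels, result, calls[0])
```

In the all-black tree the first leftmost descent already succeeds, so only a handful of calls are made. A red layer below a black top forces the procedure to visit every black node above it and every red node in that layer, which is a constant fraction of the 31 nodes; that is the linear worst case.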

Why is the red-black tree kept unbalanced after insertion?

Here is a red-black tree which seems unbalanced. If this is the case, could someone please explain why it is unbalanced?
The term "balanced" is a bit ambiguous, since different kinds of balanced trees have different constraints.
A red-black tree ensures that every path to a leaf has the same number of black nodes, and at least as many black nodes as red nodes. The result is that the longest path is at most twice as long as the shortest path, which is good enough to guarantee O(log N) time for search, insert, and delete operations.
Most other kinds of balanced trees have tighter balancing constraints. An AVL tree, for example, ensures that the lengths of the longest paths on either side of every node differ by at most 1. This is more than you need, and that has costs -- inserting or deleting in an AVL tree (after finding the target node) takes O(log N) operations on average, while inserting or deleting in a red-black tree takes O(1) operations on average.
If you wanted to keep a tree completely balanced, so that you had the same number of descendants on either side of every node, +/- 1, it would be very expensive -- insert and delete operations would take O(N) time.
Yes, it is balanced. The rule says that, counting the black NIL leaves, the longest possible path consists of at most 2*B-1 nodes, where B is the number of black nodes on the shortest possible path from the root to any leaf. In your example the shortest path has 2 black nodes, so B = 2 and the longest path could have up to 3 nodes, but it has just 2.
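The balance condition discussed in these answers can be checked mechanically. The following is a minimal sketch, assuming a tree stored as nested tuples (color, left, right) with None for a NIL leaf; that representation, the helper names, and the sample tree are all hypothetical. Here paths and black heights count the starting node and do not count the NIL.

```python
# A node is a tuple (color, left, right); None stands for a black NIL leaf.

def path_lengths(node):
    """(shortest, longest) number of nodes on a path from node down to a NIL,
    counting node itself but not the NIL."""
    if node is None:
        return 0, 0
    _, left, right = node
    left_short, left_long = path_lengths(left)
    right_short, right_long = path_lengths(right)
    return 1 + min(left_short, right_short), 1 + max(left_long, right_long)


def black_height(node):
    """Black nodes on every path from node down to a NIL (NIL not counted).
    Raises ValueError if two paths disagree."""
    if node is None:
        return 0
    color, left, right = node
    left_bh, right_bh = black_height(left), black_height(right)
    if left_bh != right_bh:
        raise ValueError("black-height property violated")
    return left_bh + (1 if color == "B" else 0)


def leaf(color):
    return (color, None, None)


# Black root, black left child, red right child with two black children.
tree = ("B", leaf("B"), ("R", leaf("B"), leaf("B")))
shortest, longest = path_lengths(tree)
print(shortest, longest, black_height(tree))
```

The sample tree looks lopsided, yet every path carries the same number of black nodes and the longest path is no more than twice the shortest, which is all a red-black tree promises.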

Maximum number of rotations to recover height balanced property in an AVL tree

If a new node is inserted at depth d into an AVL tree, what is the maximum number of rotations that may be required to recover the height balanced property?
I had guessed it may be log2(d) for the maximum, but that is not correct.
If you adopt the naive approach, you can normalize and round up the fundamental AVL balance
{left,even,right} ~ {down,even,up} ~ {green,green,red}
and clear the route all green before the insertion.
Each red-light requires at most 2 rotations, which does not include the subtree cost. The problem is that within the 4 quad-subtrees rotated, there is 1 subtree height that remains invariant.
If the next light is the green light, you need only 2 rotations for each red-light (which excludes the subtree rotation cost.)
In order to move down a new green-light from the tree-top, you have to rotate the red-lights one by one from the tree-top.
On insertion, at most one rotation (single or double) is needed. You can read about it at
https://en.wikipedia.org/wiki/AVL_tree
On deletion, a rotation may be needed on every level of the tree, so deleting a node at depth d can require O(d) rotations, which is O(log n) for a tree with n nodes.
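The claim about insertion can be checked empirically. Below is a minimal sketch of recursive AVL insertion that counts single rotations, so a double rotation shows up as 2. The class, the function names, and the stats dictionary are my own choices, and deletion is deliberately left out.

```python
import random


class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1


def height(n):
    return n.height if n else 0


def update(n):
    n.height = 1 + max(height(n.left), height(n.right))


def rotate_right(y, stats):
    x = y.left
    y.left, x.right = x.right, y  # x becomes the new subtree root
    update(y)
    update(x)
    stats["rotations"] += 1
    return x


def rotate_left(x, stats):
    y = x.right
    x.right, y.left = y.left, x  # y becomes the new subtree root
    update(x)
    update(y)
    stats["rotations"] += 1
    return y


def insert(node, key, stats):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key, stats)
    else:
        node.right = insert(node.right, key, stats)
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:  # left-heavy
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left, stats)  # left-right case
        return rotate_right(node, stats)
    if balance < -1:  # right-heavy
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right, stats)  # right-left case
        return rotate_left(node, stats)
    return node


def max_rotations_per_insert(keys):
    """Largest number of single rotations triggered by any one insertion."""
    root, stats, worst = None, {"rotations": 0}, 0
    for key in keys:
        before = stats["rotations"]
        root = insert(root, key, stats)
        worst = max(worst, stats["rotations"] - before)
    return worst


keys = list(range(1000))
random.Random(0).shuffle(keys)  # fixed seed so the run is repeatable
print(max_rotations_per_insert(keys))
```

Whatever the insertion order, the reported maximum should never exceed 2, that is, one double rotation, independent of the depth d at which the new node lands.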

Red Black Tree - max number of rotations needed for K insertions and K deletions?

What's the maximum number of rotations required after K insertions and K deletions in a Red Black tree?
I'm thinking it's 3K, since in the worst-case scenario we perform 2 rotations for every insertion and 1 rotation for every deletion.
Am I on the right track here?
In contrast to AVL trees, where rotations for deletions may propagate up to the root (although insertion needs at most one single or double rotation), RB trees require a constant number of rotations (at most 2 for insert, at most 3 for deletion). What can take logarithmic time during deletion in an RB tree is the recoloring, which may propagate up to the root. This means insert and delete have the same asymptotics for both AVL and RB trees.
(If interested, you can find an analysis of these things in this script.)
Regarding your question, at most 3K is correct (but apparently rotations are counted a little differently from the linked script).

Largest and smallest number of internal nodes in red-black tree?

The smallest number of internal nodes in a red-black tree with black height of k is 2^k - 1, which is one in the following image:
The largest number of internal nodes with black height of k is 2^(2k) - 1 which, if the black height is 2, should be 2^4 - 1 = 15. However, consider this image:
The number of internal nodes is 7. What am I doing wrong?
(I've completely rewritten this answer because, as the commenters noted, it was initially incorrect.)
I think it might help to think about this problem by using the isometry between red-black trees and 2-3-4 trees. Specifically, a red-black tree with black height h corresponds to a 2-3-4 tree with height h, where each red node corresponds to a key in a multi-key node.
This connection makes it easier for us to make a few neat observations. First, any 2-3-4 tree node in the bottom layer corresponds to a black node with either no red children, one red child, or two red children. These are the only nodes that can be leaf nodes in the red-black tree. If we wanted to maximize the number of total nodes in the tree, we'd want to make the 2-3-4 tree have nothing but 4-nodes, which (under the isometry) maps to a red/black tree where every black node has two red children. An interesting effect of this is that it makes the tree layer colors alternate between black and red, with the top layer (containing the root) being black.
Essentially, this boils down to counting the number of internal nodes in a complete binary tree of height 2h - 1 (2h layers alternating between black and red). This is equal to the number of nodes in a complete binary tree of height 2h - 2 (since if you pull off all the leaves, you're left with a complete tree of height one less than what you started with). This works out to 2^(2h - 1) - 1, which differs from the number that you were given (which I'm now convinced is incorrect) but matches the number that you're getting.
You need to count the black NIL leaves in the tree; otherwise this formula won't work. The root must not be red, since that would violate one of the properties of a red-black tree.
The problem is that you misunderstood the black height.
The black height of a node in a red-black tree is the number of black nodes from the current node to a leaf, not counting the current node. (This will be the same value on every route.)
So if you just add two black leaves to every red node you will get a red-black tree with a black height of 2 and 15 internal nodes.
(Also, in a red-black tree every red node has two black children, so red nodes can't be leaves.)
After reading the discussion above: if I add the root as a red node, the second node I add will be red again, which would be a red violation, and after restructuring I assume that we again end up with a black root and a red child, with which we might not get 2^(2k) - 1 max internal nodes.
Am I missing something here? I started working on red-black trees just recently...
It seems you haven't considered the "black leaves" (black nodes): the 2 NIL nodes for each of the red nodes on the last level. If you consider the NIL nodes as leaves, the red nodes on the last level now get counted as internal nodes, bringing the total to 15.
The tree given here actually has 15 internal nodes. The NIL black children of red nodes in last layer are missing which are actually called external nodes ( node without a key ). The tree has black-height of 2. The actual expression for maximum number of internal nodes for a tree with black-height k is 4^(k)-1. In this case, it turns out to be 15.
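The two extreme shapes can be built and counted directly. This is a minimal sketch, assuming the convention stated in this thread (black height counts the black nodes below the root, including the NIL leaf); the tuple representation and the helper names are my own.

```python
# A node is a tuple (color, left, right); None stands for a black NIL leaf.

def perfect(colors):
    """Perfect binary tree whose i-th level is colored colors[i]."""
    if not colors:
        return None
    return (colors[0], perfect(colors[1:]), perfect(colors[1:]))


def internal_nodes(node):
    """Number of key-carrying nodes, i.e. everything except the NIL leaves."""
    if node is None:
        return 0
    _, left, right = node
    return 1 + internal_nodes(left) + internal_nodes(right)


def black_height(root):
    """Black nodes below the root on the leftmost path, counting the NIL leaf.
    All paths agree in a valid red-black tree, so one path is enough here."""
    _, node, _ = root
    count = 0
    while node is not None:
        color, node, _ = node
        count += 1 if color == "B" else 0
    return count + 1  # the black NIL leaf at the end of the path


for k in range(1, 5):
    smallest = perfect("B" * k)   # all-black tree
    largest = perfect("BR" * k)   # every black node has two red children
    print(k, internal_nodes(smallest), internal_nodes(largest))
```

For k = 2 the alternating tree has black, red, black, red layers followed by the NIL leaves, which is the 15-node case discussed here, while the all-black tree with the same black height has only 3 internal nodes.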
In red-black trees, external nodes (null nodes) are always black, but in the second tree in your question you have not drawn the external nodes, and hence you get a count of 7. If you draw the external nodes and then count the internal nodes, you can see that it turns out to be 15.
Not sure that I understand the question.
For any binary tree in which all layers (except maybe the last one) have the maximum number of items, we will have 2^(k-1)-1 internal nodes, where k is the number of layers. In the second picture you have 4 layers, so the number of internal nodes is 2^(4-1)-1=7.
