Best way to join two red-black trees - algorithm

The easiest way is to store the two trees in two arrays, merge the arrays, and build a new red-black tree from the sorted result, which takes O(m + n) time.
Is there an algorithm with lower time complexity?

You can merge two red-black trees in time O(m log(n/m + 1)) where n and m are the input sizes and, WLOG, m ≤ n. Notice that this bound is tighter than O(m+n). Here's some intuition:
When the two trees are similar in size (m ≈ n), the bound is approximately O(m) = O(n) = O(n + m).
When one tree is significantly larger than the other (m ≪ n), the bound is approximately O(m log n); in particular, it is O(log n) when m is a constant.
You can find a brief description of the algorithm here. A more in-depth description which generalizes to other balancing schemes (AVL, BB[α], Treap, ...) can be found in a recent paper.
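To make the shape of that algorithm concrete, here is an illustrative, unbalanced rendition of the join-based union those sources describe. The recursion structure is the point; `split` and `join` below are my own minimal stand-ins, and a real implementation would use a rebalancing `join` (red-black, AVL, treap, ...) to actually achieve the O(m log(n/m + 1)) bound:

```python
# Illustrative, *unbalanced* sketch of join-based set union. The recursion
# matches the scheme described above, but `join` here does no rebalancing.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def split(t, k):
    """Partition t into (keys < k, keys > k), discarding any node equal to k."""
    if t is None:
        return None, None
    if k < t.key:
        l, r = split(t.left, k)
        return l, Node(t.key, r, t.right)
    if k > t.key:
        l, r = split(t.right, k)
        return Node(t.key, t.left, l), r
    return t.left, t.right

def join(l, k, r):
    """Concatenate l < k < r; a balanced variant would rebalance here."""
    return Node(k, l, r)

def union(t1, t2):
    if t1 is None:
        return t2
    if t2 is None:
        return t1
    # Partition t2 around t1's root key, then recurse on both sides.
    less, greater = split(t2, t1.key)
    return join(union(t1.left, less), t1.key, union(t1.right, greater))
```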

I think that if you have generic sets (so a generic red-black tree) you can't use the solution suggested by Sam Westrick, because he assumes that all elements in the first set are less than all elements in the second set. Cormen et al. (CLRS, an excellent book for learning algorithms and data structures) also specify this condition for joining two red-black trees.

Since you need to compare each element of the two red-black trees (of sizes m and n), you will have to deal with a minimum of O(m + n) time complexity. There is a way to do it with O(1) space complexity, but that is a separate matter from your question. If you do not iterate over and check each element of each red-black tree, you cannot guarantee that your new red-black tree will be sorted. I can think of another way of merging two red-black trees, called "In-Place Merge using DLL", but it also results in O(m + n) time complexity. The steps are listed below, and a code sketch follows them:
1. Convert the two red-black trees into doubly linked lists, which takes O(m + n) time.
2. Merge the two sorted linked lists, which takes O(m + n) time.
3. Build a balanced red-black tree from the merged list created in step 2, which takes O(m + n) time.
The time complexity of this method is also O(m + n). So, because you have to compare the elements of each tree with the elements of the other, you end up with at least O(m + n).
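As mentioned, here is a minimal sketch of the three steps, with Python lists standing in for the doubly linked lists (the asymptotics are the same) and node colors omitted; a tree built this way is balanced, and a valid red-black coloring can be assigned afterwards:

```python
# Sketch of the flatten-merge-rebuild approach; plain Python lists stand in
# for the doubly linked lists, and colors are omitted for brevity.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def to_sorted_list(t, out):           # step 1: in-order flatten, O(n)
    if t:
        to_sorted_list(t.left, out)
        out.append(t.key)
        to_sorted_list(t.right, out)

def merge_sorted(a, b):               # step 2: two-pointer merge, O(m + n)
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

def build_balanced(a, lo=0, hi=None): # step 3: balanced tree from sorted list, O(m + n)
    if hi is None:
        hi = len(a)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return Node(a[mid],
                build_balanced(a, lo, mid),
                build_balanced(a, mid + 1, hi))
```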

Related

Tree Sort Performance

I have an AVL tree implementation where the insertion method runs in O(log n) time and the method that returns an in-order list representation runs in O(n^2) time. Suppose I have a list that needs to be sorted. Using a for-loop, I can iterate through the list and insert each element into the AVL tree, which runs in O(n log n) time combined. So what is the performance of this entire sorting algorithm (i.e., iterate through the list, insert each element, then use an in-order traversal to return a sorted list)?
You correctly say that adding n elements to the tree takes O(n log n) time. A simple in-order traversal of a BST can be performed in O(n) time. It is thus possible to get a sorted list of the elements in O(n log n + n) = O(n log n) time. If your algorithm for generating the sorted list from the tree is genuinely quadratic (i.e., Θ(n^2) in the worst case), the worst-case time complexity of the procedure you describe is O(n log n + n^2) = O(n^2), which is not optimal.
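For concreteness, here is a hedged sketch of the whole procedure with a plain (unbalanced) BST insert; your AVL insert would add rebalancing, and the important part is that the in-order traversal is written to run in O(n) rather than O(n^2) (e.g., by appending to a shared list instead of concatenating lists at every node):

```python
# Tree sort sketch: n inserts of O(log n) each (with a balanced tree), then
# one O(n) in-order traversal that appends to a shared list.
class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):   # plain BST insert; an AVL insert adds rebalancing
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def in_order(node, out): # O(n): each node is visited exactly once
    if node:
        in_order(node.left, out)
        out.append(node.key)
        in_order(node.right, out)

def tree_sort(xs):
    root = None
    for x in xs:         # n inserts: O(n log n) with a balanced tree
        root = insert(root, x)
    out = []
    in_order(root, out)  # O(n)
    return out
```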

O(n log n) algorithm problems

So I was just finishing up reading on time complexity, and I ran into a few questions that I can't solve.
"You’re given an array of n integers, and must answer a series of n
queries, each of the form: “how many elements of the array have value between
L and R?”, where L and R are integers. Design an O(n log n) algorithm that
answers all of these queries."
Thanks.
You can use a segment tree.
Or a balanced tree such as an AVL tree, storing in each node the size of its subtree. For each query, locate L and R in O(log n) and count the nodes that fall in the range. Construction takes O(n log n), and the n queries together also take O(n log n).
Or a BIT (binary indexed tree) with the same approach as the second idea.
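As a point of comparison (my addition, not one of the three ideas above), the bound can also be met without any tree at all: sort the array once, then answer each query with two binary searches:

```python
import bisect

def count_in_ranges(arr, queries):
    """Answer 'how many elements lie in [L, R]?' for each (L, R) query."""
    a = sorted(arr)                              # O(n log n), done once
    return [bisect.bisect_right(a, r) - bisect.bisect_left(a, l)
            for l, r in queries]                 # O(log n) per query

# Example: values between 2 and 5 inclusive.
print(count_in_ranges([1, 4, 2, 8, 5, 7], [(2, 5)]))  # -> [3]
```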

Binary tree in-order traversal sorting complexity

I am confused about why quicksort, shellsort, mergesort, and the other O(n log n) algorithms are repeatedly mentioned as popular sorting algorithms. Doesn't an in-order traversal of a binary search tree give O(n) complexity for sorting? What am I missing?
No. Building the tree has O(N log N) complexity (i.e., you're inserting N items into the tree, and each insertion has logarithmic complexity).
Once you have the tree built, you can traverse it with linear complexity (especially if it's a threaded tree), but that's not equivalent to sorting--it's equivalent to traversing an array after you've sorted it.
Although they have the same asymptotic complexity, building a tree will usually be slower by a substantial factor, because you have to allocate nodes for the tree and traverse non-contiguously allocated nodes to walk the tree.
Some common complexity classes are O(1), O(log n), O(n), O(n log n), O(n^2), and O(n^3).
For the first part of your question: the popular sorting algorithms have O(n log n) complexity simply because a comparison-based sort cannot run in O(n).
An O(n) bound would require sorting the array in a single pass, which is not possible when the only tool is pairwise comparison.
So the next achievable complexity is O(n log n), via divide and conquer. In merge sort, for example, we find the middle element and sort each side recursively; since the size halves at each step, the recursion has O(log n) levels, and the O(n) work per level makes the total O(n log n).
For the next part of your question, remember that basic operations such as insertion and deletion in a balanced BST cost O(log n), where log n is the height of the tree.
So building the tree takes O(n log n), and even with the O(n) traversal the total is O(n log n).
Comment if you didn't get my point. This is the simplest way to answer it, I think.
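For completeness, the standard decision-tree argument behind the claim that comparisons alone cannot sort in O(n): a comparison sort must distinguish all n! input orderings, and a binary decision tree of height h has at most 2^h leaves, so

```latex
2^h \ge n! \implies h \ge \log_2(n!) = \Theta(n \log n)
```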
Any traversal of a binary tree is O(n), but that is not sorting.
Having a BST means the data is already sorted; you are just traversing it. Building the BST is the actual sorting process, which is asymptotically O(n log n).
Please note that O(n) + O(n log n) is the same as O(n log n).

LSM Tree lookup time

What's the worst case time complexity in a log-structured merge tree for a simple search query (like querying a single WHERE clause)?
Is it O(log N)? O(N*Log N)? Something else?
How about for a multiple query, like searching for multiple WHERE clauses in a key-value database?
The wikipedia page on LSM trees is currently lacking this info.
And I'm trying to make sense of the original paper.
I have been wondering the same.
If you have a series of trees, getting smaller by a constant factor each time, and you need to search them all for a single key, the cost seems to be O(log(N)^2).
Say the first (binary) tree takes log_2(N) branches to reach a node. The second might be half the size, and take (log_2(N) - 1) branches to find a node. The smallest tree will be some O(1) constant in size and there are roughly log_2(N) trees total. Summing the series gives O(log_2(N)^2).
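Making that summation explicit (assuming roughly log_2(N) trees whose search depths decrease by one per level):

```latex
\sum_{i=0}^{\log_2 N} \left(\log_2 N - i\right) \;=\; \frac{(\log_2 N)(\log_2 N + 1)}{2} \;=\; \Theta\!\left(\log^2 N\right)
```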
However, I'm wondering if there is some more clever scheme where arbitrary single-key lookups, insertions or deletions have amortized cost O(log(N)), but haven't been able to find an answer (yet).
For a simple search indexed by an LSM tree, it is O(log n). This is because the biggest tree in the LSM tree is a B-tree, whose search is O(log n), and the other trees are smaller B-trees (or, in the case of the in-memory trees, more efficient structures), each no worse than O(log n). The number of trees is a constant, so it doesn't affect the order of the search time.

Is there a fast algorithm to merge sorted B+Trees?

I'm writing a dbm style database manager with immutable B+Trees as the storage medium (see http://sf.net/projects/aodbm/ ). Is there a fast algorithm for merging two B+Trees (where the trees potentially share nodes)?
It is an Ω(n) problem.
Proof sketch: assume merging had a better complexity, O(d) with d = o(n). A B+ tree can be flattened into a sorted array, so merging two trees would let you merge two sorted arrays in O(d). Using that merge inside merge sort would yield a comparison sort running in o(n log n), a contradiction.
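In symbols: if the merge step cost only o(n), the merge sort recurrence would informally become

```latex
T(n) = 2\,T(n/2) + o(n) \implies T(n) = o(n \log n),
```

which contradicts the Ω(n log n) lower bound for comparison-based sorting.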
An O(n) solution (which is actually, of course, Θ(n)); a code sketch of steps 3 and 4 follows the complexity breakdown below:
1. Flatten T1 and T2 into sorted arrays A1, A2.
2. A ← merge(A1, A2).
3. Build an empty "almost full" tree T with exactly |T1| + |T2| slots.
4. Fill T with A (in-order traversal).
5. The result is T.
Complexity:
Step 1 is O(n) (in-order traversal).
Step 2 is O(n), the merge's cost (since A1 and A2 are both sorted).
Steps 3 and 4 are trivially O(n) as well.
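A short sketch of steps 3 and 4, the distinctive part (the flattening and merging steps mirror the linked-list sketch in the earlier answer; `fill_in_order` works on any node objects with `left`, `right`, and `key` attributes, and the names are illustrative):

```python
# Steps 3-4 sketch: fill a pre-built tree shape with sorted keys, in-order.
def fill_in_order(node, it):
    """Assign values from iterator `it` to the tree slots in-order, O(n)."""
    if node is not None:
        fill_in_order(node.left, it)
        node.key = next(it)          # slots are consumed in sorted order
        fill_in_order(node.right, it)

# Usage sketch: after A = merge(A1, A2), call fill_in_order(T, iter(A)).
```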
