I have an AVL tree implementation where the insertion method runs in O(log n) time and the method that returns an in-order list representation runs in O(n^2) time. Suppose I have a list that needs to be sorted. Using a for-loop, I can iterate through the list and insert each element into the AVL tree, which takes O(n log n) time in total. So what is the performance of the entire sorting algorithm (i.e. iterate through the list, insert each element, then use the in-order traversal to return a sorted list)?
You correctly say that adding n elements to the tree takes O(n log(n)) time. A simple in-order traversal of a BST can be performed in O(n) time, so it is possible to get a sorted list of the elements in O(n log(n) + n) = O(n log(n)) time. If, however, your algorithm for generating the sorted list from the tree is quadratic (i.e. in O(n^2) but not in O(n)), then the worst-case time complexity of the procedure you describe is O(n log(n) + n^2) = O(n^2), which is not optimal.
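Assuming the in-order step is replaced by a plain O(n) traversal, the whole pipeline sorts in O(n log(n)). Here is a minimal sketch of that pipeline, using java.util.TreeMap (a red-black tree) as a stand-in for your AVL tree, with a count per key so duplicates survive:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Tree sort sketch: n inserts at O(log n) each, then one O(n) in-order walk.
public class TreeSortSketch {
    static List<Integer> treeSort(List<Integer> input) {
        TreeMap<Integer, Integer> tree = new TreeMap<>();
        for (int x : input) {
            tree.merge(x, 1, Integer::sum); // O(log n) per insert
        }
        List<Integer> sorted = new ArrayList<>(input.size());
        for (Map.Entry<Integer, Integer> e : tree.entrySet()) { // in-order walk, O(n)
            for (int i = 0; i < e.getValue(); i++) sorted.add(e.getKey());
        }
        return sorted;
    }
}
```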
The easiest way is to store the two trees in two arrays, merge them, and build a new red-black tree from the sorted array, which takes O(m + n) time.
Is there an algorithm with lower time complexity?
You can merge two red-black trees in time O(m log(n/m + 1)) where n and m are the input sizes and, WLOG, m ≤ n. Notice that this bound is tighter than O(m+n). Here's some intuition:
When the two trees are similar in size (m ≈ n), the bound is approximately O(m) = O(n) = O(n + m).
When one tree is significantly larger than the other (m ≪ n, e.g. m = O(1)), the bound is approximately O(log n).
You can find a brief description of the algorithm here. A more in-depth description which generalizes to other balancing schemes (AVL, BB[α], Treap, ...) can be found in a recent paper.
I think that if you have generic sets (i.e. a generic red-black tree) you can't use the solution suggested by Sam Westrick, because it assumes that all elements in the first set are less than all elements in the second set. Cormen (the best book to learn algorithms and data structures from) also specifies this condition for joining two red-black trees.
Since you need to compare the elements of both red-black trees, you will have to pay at least O(m + n) time: if you do not examine every element of each tree, you cannot guarantee that your new red-black tree will be sorted. (There is a way to do it in O(1) space complexity, but that is a separate question.) Another way of merging two red-black trees, called "in-place merge using DLL" (doubly linked list), also results in O(m + n) time complexity:
1. Convert the two given red-black trees into doubly linked lists, which takes O(m + n) time.
2. Merge the two sorted linked lists, which takes O(m + n) time.
3. Build a balanced red-black tree from the merged list created in step 2, which takes O(m + n) time.
The time complexity of this method is also O(m + n).
So, because you have to compare the elements of each tree with those of the other, you end up with at least O(m + n).
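A minimal sketch of this flatten-merge-rebuild approach. The bare Node type is a hypothetical stand-in for a real red-black tree node (recoloring is omitted), and for simplicity the sketch flattens into lists rather than a doubly linked list; the asymptotics are the same.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal node type standing in for a red-black tree node;
// color bookkeeping is omitted in this sketch.
class Node {
    int key;
    Node left, right;
    Node(int key) { this.key = key; }
}

public class MergeTrees {
    // Step 1: an in-order traversal flattens a BST into a sorted list, O(n).
    static void inorder(Node t, List<Integer> out) {
        if (t == null) return;
        inorder(t.left, out);
        out.add(t.key);
        inorder(t.right, out);
    }

    // Step 2: the standard merge of two sorted lists, O(m + n).
    static List<Integer> merge(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>(a.size() + b.size());
        int i = 0, j = 0;
        while (i < a.size() && j < b.size())
            out.add(a.get(i) <= b.get(j) ? a.get(i++) : b.get(j++));
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }

    // Step 3: build a height-balanced tree from the sorted list, O(m + n).
    static Node build(List<Integer> xs, int lo, int hi) {
        if (lo > hi) return null;
        int mid = (lo + hi) >>> 1;
        Node root = new Node(xs.get(mid));
        root.left = build(xs, lo, mid - 1);
        root.right = build(xs, mid + 1, hi);
        return root;
    }
}
```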
What would be the worst-case time complexity of building a binary search tree from N arbitrary elements?
I think there is a difference between being given all N elements up front and having the elements arrive one by one, building up a BST of N elements in total.
In the former case it is O(n log n), and in the latter it is O(n^2). Am I right?
If the binary search tree (BST) is not kept balanced, the worst-case time complexity is O(n^2). Generally a BST is built by repeated insertion, so the worst case will be O(n^2). But if you can sort the input (in O(n log n)), the tree can be built in O(n), giving an overall complexity of O(n log n).
If the BST is self-balancing, the worst-case time complexity is O(n log n) even with repeated insertion.
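To see where the O(n^2) comes from, here is a sketch of a naive, non-balancing insert. Feeding it already-sorted keys degenerates the tree into a linked list, so the i-th insert walks past i nodes and the total is 1 + 2 + ... + n = O(n^2).

```java
// Naive (non-balancing) BST insert: with already-sorted input every new key
// goes to the right of the previous one, so the tree degenerates into a
// linked list and n inserts cost O(n^2) in total.
class NaiveBst {
    static class Node { int key; Node left, right; Node(int k) { key = k; } }
    Node root;

    void insert(int key) {
        if (root == null) { root = new Node(key); return; }
        Node cur = root;
        while (true) {
            if (key < cur.key) {
                if (cur.left == null) { cur.left = new Node(key); return; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = new Node(key); return; }
                cur = cur.right;
            }
        }
    }
}
```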
Consider the scenario where the data to be inserted is always in order, i.e. (1, 5, 12, 20, ...) with A[i] >= A[i-1], or (1000, 900, 20, 1, -2, ...) with A[i] <= A[i-1].
To support such a dataset, is it more efficient to use a binary search tree or an array?
(Side note: I am just trying to run some naive analysis for a timed hash map of type (K, T, V) where the time is always in order. I am debating using Map<K, BST<T,V>> vs Map<K, Array<T,V>>.)
As I understand it, the following worst-case costs apply:

         Array      BST
Space    O(n)       O(n)
Search   O(log n)   O(n)
Max/Min  O(1)       O(1) *
Insert   O(1) **    O(n)
Delete   O(n)       O(n)

*: with Max/Min pointers
**: amortized time complexity
Q: To be clearer about the question: which of these two data structures should I use for such a scenario? Please feel free to discuss other data structures, such as self-balancing BSTs.
EDIT:
Please note I didn't consider the complexity of a balanced binary search tree (red-black tree, etc.); as mentioned, this is a naive analysis using a plain binary search tree.
Deletion has been updated to O(n) (I hadn't counted the time to find the node).
Max/Min for a skewed BST costs O(n), but it is also possible to store pointers to the max and min, making the overall time complexity O(1).
See the table below, which will help you choose. Note that I am assuming two things:
1) Data always arrives in sorted order (as you mentioned): if 1000 is the last datum inserted, new data will always be greater than 1000. If data does not arrive in sorted order, insertion can take O(log n), but deletion will not change.
2) Your "array" is actually similar to java.util.ArrayList; in short, its length is mutable. (It is actually unfair to compare a mutable and an immutable data structure.) If it were a plain fixed-size array, deletion would take amortized O(log n) {O(log n) to search and O(1) to delete, amortized because you may need to create a new array} and insertion would take amortized O(1) {you may need to create a new array}.
         ArrayList   BST
Space    O(n)        O(n)
Search   O(log n)    O(log n) {optimized from O(n)}
Max/Min  O(1)        O(log n) {instead of O(1): you need to traverse to a leaf}
Insert   O(1)        O(log n) {optimized from O(n)}
Delete   O(log n)    O(log n)
So, based on this, ArrayList seems better.
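As a hypothetical illustration of the array side of this comparison: since the timestamps arrive in non-decreasing order, insert is an amortized O(1) append at the end, search is an O(log n) binary search, and max/min are O(1) lookups at the ends. The class name and the long-typed timestamps are illustrative assumptions, not from the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Sketch of the timed-series scenario backed by an ArrayList.
public class TimedSeries {
    private final List<Long> times = new ArrayList<>();

    public void insert(long t) {
        times.add(t); // input is assumed pre-sorted, so appending keeps order: amortized O(1)
    }

    public boolean contains(long t) {
        return Collections.binarySearch(times, t) >= 0; // O(log n) on the sorted list
    }

    public long min() { return times.get(0); }                // O(1)
    public long max() { return times.get(times.size() - 1); } // O(1)
}
```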
What will be the time complexity of displaying data in alphabetical order using a skip list?
And what will the time complexity be if we implement the skip list using quad nodes?
Let's assume the input contains N elements. First you have to construct the skip list. The complexity of a single insert operation is O(log N) on average, so inserting N elements takes O(N log N). Once the skip list is constructed, its elements are sorted, so enumerating them only requires visiting each element, which is O(N). The total is thus O(N log N + N) = O(N log N).
It is worth noting that the skip list is based on randomness, so the O(log N) complexity of a single insert is not guaranteed. The worst-case complexity of one insert is O(N), which means that in the worst case inserting N elements into the skip list takes O(N^2).
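As a concrete illustration, the JDK ships a skip-list-backed sorted set, java.util.concurrent.ConcurrentSkipListSet; here is a small sketch of the insert-then-enumerate pattern described above:

```java
import java.util.concurrent.ConcurrentSkipListSet;

// N inserts at expected O(log N) each, then an O(N) in-order enumeration.
public class SkipListDemo {
    public static void main(String[] args) {
        ConcurrentSkipListSet<String> set = new ConcurrentSkipListSet<>();
        for (String s : new String[] { "cherry", "apple", "banana" }) {
            set.add(s); // expected O(log N) per insert
        }
        for (String s : set) {
            System.out.println(s); // prints apple, banana, cherry: O(N) total
        }
    }
}
```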
I am confused as to why quicksort, shellsort, mergesort, and the other O(n log(n)) algorithms are repeatedly mentioned as the popular sorting algorithms. Doesn't an in-order traversal of a binary search tree give O(n) complexity for sorting? What am I missing?
No. Building the tree has O(N log N) complexity (i.e., you're inserting N items into the tree, and each insertion has logarithmic complexity).
Once you have the tree built, you can traverse it with linear complexity (especially if it's a threaded tree), but that's not equivalent to sorting; it's equivalent to traversing an array after you've sorted it.
Although they have the same asymptotic complexity, building a tree will usually be slower by a substantial factor, because you have to allocate nodes for the tree and then follow pointers between non-contiguously allocated nodes to walk it.
Basically, there are six complexity classes you will encounter most often: O(1), O(log n), O(n), O(n log n), O(n^2), and O(n^3).
For the first part of your question: the popular sorting algorithms have O(n log n) complexity simply because we cannot sort an array in O(n). O(n) would mean sorting the array in a single pass, and that is not possible in general; comparison-based sorting has a lower bound of Ω(n log n).
So the next achievable complexity is O(n log n), reached by divide and conquer, as in merge sort: we find the middle element and recursively sort each side of it. Since the size halves at each level of recursion, the depth is O(log n); with O(n) merging work per level, the overall complexity is O(n log n).
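For illustration, here is a minimal merge sort along those lines; the halving gives O(log n) levels of recursion and the merging does O(n) work per level:

```java
import java.util.Arrays;

// Minimal merge sort sketch: split at the middle, sort each half recursively,
// then merge the two sorted halves in linear time.
public class MergeSort {
    static int[] sort(int[] a) {
        if (a.length <= 1) return a;
        int mid = a.length / 2;
        int[] left = sort(Arrays.copyOfRange(a, 0, mid));
        int[] right = sort(Arrays.copyOfRange(a, mid, a.length));
        int[] out = new int[a.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length)
            out[k++] = left[i] <= right[j] ? left[i++] : right[j++];
        while (i < left.length) out[k++] = left[i++];
        while (j < right.length) out[k++] = right[j++];
        return out;
    }
}
```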
For the next part of your question, remember that the basic operations such as insertion and deletion in a balanced BST have O(log n) complexity, where log n is the height of the tree. So even though the final traversal of the tree is only O(n), building it costs O(n log n), which makes the sort O(n log n) in total.
Comment if you didn't get my point. This is the simplest way to answer it, I think.
Any traversal of a binary tree is O(n), but that is not sorting.
Having a BST means the data is already sorted; you are just traversing it. Building the BST is the sorting process, which is asymptotically O(n log(n)).
Please note that O(n) + O(n log(n)) is the same as O(n log(n)).