Are all trees in Fibonacci heaps binomial trees?

Is it possible for a Fibonacci heap to contain a tree that isn't a binomial tree? If so, how would this happen? Can you give an example?

Yes, this can happen. Intuitively, the reason is that in a Fibonacci heap, the decrease-key operation can work by cutting a subtree from a larger tree, resulting in two trees that are (potentially) not binomial trees. This differs from the binomial heap, where decrease-key works by bubbling the decreased key up from its node toward the root.
To see a concrete example, let's insert five elements into a Fibonacci heap, say, 1, 3, 5, 7, and 9. This gives the heap
1 - 3 - 5 - 7 - 9
Now, let's do a dequeue-min, which extracts 1. We now try to compact all of the remaining elements together, which merges the trees as follows:
  3
 / \
5   7
    |
    9
Now, suppose that we do a decrease-key operation to decrease the key of 9 to 6. To do this, we cut 9 from its parent and merge it into the list of trees at the top, which yields
  3 - 6
 / \
5   7
And now the tree with 3 at its root contains only three elements, so it is not a binomial tree anymore (a binomial tree always contains a power-of-two number of nodes).
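To make the cutting step concrete, here is a minimal Python sketch of what decrease-key does. The Node class and root_list are simplified stand-ins of my own; a real Fibonacci heap would also mark nodes, do cascading cuts, and maintain the min pointer.

    class Node:
        def __init__(self, key):
            self.key = key
            self.parent = None
            self.children = []

    def decrease_key(root_list, node, new_key):
        # Lower the key; if heap order with the parent breaks, cut the
        # whole subtree rooted at node and splice it into the root list.
        assert new_key <= node.key
        node.key = new_key
        parent = node.parent
        if parent is not None and node.key < parent.key:
            parent.children.remove(node)   # cut from the parent
            node.parent = None
            root_list.append(node)         # now a top-level tree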
Hope this helps!

Related

What can a binary heap do that a binary search tree cannot?

This is something I do not quite understand. When I read literature on heaps, it always says that the big advantage of a heap is that you have the top (max if max heap) element immediately available. But couldn't you just use a BST and store a pointer to the same node (bottom-rightmost) and update the pointer with insertions/deletions?
If I'm not mistaken, with the BST implementation I'm describing you would have
==============================================
              |  Insert      |  Remove Max
==============================================
Special BST   |  O(log(n))   |  O(1)
==============================================
Max Heap      |  O(log(n))   |  O(log(n))
==============================================
making it better.
Pseudo-code:
Insert:
    Same as regular BST insert, but can keep track of whether
    the item inserted is the max, because the traversal will be
    entirely in the right direction.

Delete:
    Set parent of max equal to null. Done.
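In code, the insert I have in mind would look roughly like this (a minimal Python sketch with a hypothetical BSTNode class, no balancing):

    class BSTNode:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None

    def insert(root, max_node, key):
        # Ordinary O(h) BST insert; the new node is the max exactly
        # when the walk never turns left.
        node, went_left = root, False
        while True:
            if key < node.key:
                went_left = True
                if node.left is None:
                    new = node.left = BSTNode(key)
                    break
                node = node.left
            else:
                if node.right is None:
                    new = node.right = BSTNode(key)
                    break
                node = node.right
        return max_node if went_left else new   # updated max pointer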
What am I missing here?
> But couldn't you just use a BST and store a pointer to the same node (bottom-rightmost) and update the pointer with insertions/deletions?
Yes, you could.
> with the BST implementation I'm describing you would have [...] Remove Max O(1) [...] making it better.
> [...] Set parent of max equal to null. Done.
No, Max removal wouldn't (always) be O(1), for the following reasons:
After you have removed the Max, you need to also update the pointer to reference the bottom right-most node. For example, take this tree, before the Max is removed:
     8
    / \
   5   20   <-- Max pointer
  /   /
 2   12
    /  \
  10    13
          \
           14
You'll have to find the node with value 14 in order to update the Max pointer.
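In code, that pointer update would look roughly like this (a sketch, assuming nodes also store a parent pointer):

    def new_max_after_removal(old_max):
        # The old max is the bottom-rightmost node, so the next max is
        # the rightmost node of its left subtree (20 -> 12 -> 13 -> 14
        # in the example above), or its parent if it has no left child.
        if old_max.left is not None:
            node = old_max.left
            while node.right is not None:   # walk down-right: O(h) steps
                node = node.right
            return node
        return old_max.parent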
The above operation can be made O(1) by keeping the tree balanced, say according to the AVL rules. In that case the left child of the previous Max node would not have a right child, and the new Max node would be either its left child or, if it didn't have one, its parent. But as some deletions will make the tree unbalanced, they would need to be followed by a rebalancing operation, and that may involve several rotations. For instance, take this balanced BST:
        8
      /   \
     5     13
    / \    / \
   2   6  9   15   <-- Max pointer
  / \   \  \
 1   4   7  10
    /
   3
After removal of node 15, it is easy to determine that 13 is the next Max, but the subtree rooted at 13 would not be balanced. After balancing it, the tree as a whole is unbalanced, and another rotation would be needed. The number of rotations could be O(logn).
Concluding: you can use a balanced BST with a Max pointer, but extraction of the Max node is still an O(log n) operation, giving it the same time complexity as the corresponding operation on a binary heap.
> What can a binary heap do that a binary search tree cannot?
Considering that a binary heap uses no pointers, and thus has much less "administrative" overhead than a self-balancing BST, the actual space consumption and running time of the insert/delete operations will be better by a constant factor, while their asymptotic complexity is the same.
Also, a binary heap can be built from an unsorted array in O(n) time, while building a BST costs O(n log n).
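For instance, with Python's heapq module (a binary min-heap stored in a plain list), the O(n) build and the O(1) peek look like this:

    import heapq

    data = [9, 4, 7, 1, 3, 8, 5]
    heapq.heapify(data)         # O(n): rearranges the list into a valid min-heap
    print(data[0])              # 1 -- the min sits at index 0, readable in O(1)
    print(heapq.heappop(data))  # O(log n): removes and returns the min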
However, a BST is the way to go when you need to be able to traverse the values in their proper order, or find a value, or find a value's predecessor/successor. A binary heap has worse time complexities for such operations.
Both max heaps and balanced BSTs (e.g. AVL trees) perform these operations in O(log n) time. But BSTs take a constant factor more space due to pointers, and their code is more complicated.
Since you're talking about BST's and not Balanced BST's, consider the following skewed BST:
1
 \
  2
   \
    3
     \
      ...
       \
        n
You can hold a pointer reference to the max (n-th) element, but if you're inserting a value < n, it will require O(n) insertion time in the worst case. Also, to see the max value you could simply do heap[0] (assuming the heap is implemented using an array), so the heap gives you the max element in O(1) time as well.

Skewed binary tree vs Perfect binary tree - space complexity

Does a skewed binary tree take more space than, say, a perfect binary tree?
I was solving question #654 - Maximum Binary Tree on Leetcode, where, given an array, you have to build a binary tree such that the root is the maximum number in the array, and the left and right subtrees are built on the same principle from the subarrays to the left and right of the max number. There it's concluded that in the average and best case (perfect binary tree) the space taken would be O(log(n)), and in the worst case (skewed binary tree) it would be O(n).
For example, given nums = [1,3,2,7,4,6,5],
the tree would be as such,
     7
    / \
   3   6
  / \ / \
 1  2 4  5
and if given nums = [7,6,5,4,3,2,1],
the tree would be as such,
7
 \
  6
   \
    5
     \
      4
       \
        3
         \
          2
           \
            1
According to my understanding they both should take O(n) space, since they both have n nodes. So I don't understand how they came to that conclusion.
Thanks in advance.
https://leetcode.com/problems/maximum-binary-tree/solution/
Under "Space complexity," it says:
> Space complexity : O(n). The size of the set can grow upto n in the worst case. In the average case, the size will be nlogn for n elements in nums, giving an average case complexity of O(logn).
It's poorly worded, but it is correct. It's talking about the amount of memory required during construction of the tree, not the amount of memory that the tree itself occupies. As you correctly pointed out, the tree itself will occupy O(n) space, regardless of whether it's balanced or degenerate.
Consider the array [1,2,3,4,5,6,7]. You want the root to be the highest number, and the left to be everything that's to the left of the highest number in the array. Since the array is in ascending order, what happens is that you extract the 7 for the root, and then make a recursive call to construct the left subtree. Then you extract the 6 and make another recursive call to construct that node's left subtree. You continue making recursive calls until you place the 1. In all, you have six nested recursive calls: O(n).
Now look what happens if your initial array is [1,3,2,7,5,6,4]. You first place the 7, then make a recursive call with the subarray [1,3,2]. Then you place the 3 and make a recursive call to place the 1. Your tree is:
    7
   /
  3
 /
1
At this point, your call depth is 2. You return and place the 2. Then return from the two recursive calls. The tree is now:
     7
    /
   3
  / \
 1   2
Constructing the right subtree also requires a call depth of 2. At no point is the call depth more than two. That's O(log n).
It turns out that the call stack depth is the same as the tree's height. The height of a perfect tree is O(log n), and the height of a degenerate tree is O(n).
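Here is a minimal sketch of that construction (using nested tuples instead of tree nodes) that makes the space bound visible: each nested call adds a stack frame, so peak call-stack usage equals the height of the resulting tree. (The nums.index(max(nums)) scan costs O(n) per call, but that affects time, not space.)

    def build(nums):
        # One stack frame per level of the resulting tree.
        if not nums:
            return None
        i = nums.index(max(nums))          # the maximum becomes the root
        return (nums[i], build(nums[:i]), build(nums[i + 1:]))

    build([1, 3, 2, 7, 4, 6, 5])   # tree of height 3: O(log n) stack frames
    build([7, 6, 5, 4, 3, 2, 1])   # tree of height 7 (a chain): O(n) stack frames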

Can I achieve begin insertion on a binary tree in O(log(N))?

Consider a binary tree and some traversal criterion that defines an ordering of the tree's elements.
Does there exist some particular traversal criterion that would allow a begin_insert operation, i.e. the operation of adding a new element that would be at position 1 according to the ordering induced by the traversal criterion, with O(log(N)) cost?
I don't have any strict requirement, like the tree guaranteed to be balanced.
EDIT:
But I cannot accept lack of balance if that allows degeneration to O(N) in worst case scenarios.
EXAMPLE:
Let's try to see if in-order traversal would work.
Consider the BT (not a binary search tree)
       6
      / \
    13   5
    / \  /
   2   8 9
In-order traversal gives 2-13-8-6-9-5
Perform begin_insert(7) in such a way that in-order traversal gives 7-2-13-8-6-9-5:
         6
        / \
      13   5
      / \  /
     2   8 9
    /
   7
Now, I think this is not a legitimate O(log(N)) strategy, because if I keep adding values in this way the cost degenerates into O(N) as the tree becomes increasingly unbalanced:
           6
          / \
        13   5
        / \  /
       2   8 9
      /
     7
    /
   *
  /
 *
/
This strategy would work if I rebalance the tree by preserving ordering:
        8
      /   \
     2     9
    / \   / \
   7  13 6   5
but this costs at least O(N).
According to this example my conclusion would be that in-order traversal does not solve the problem, but since I received feedback that it should work, maybe I am missing something?
Inserting, deleting and finding in a binary tree all rely on the same search algorithm to find the right position for the operation. The complexity of this is O(max height of the tree). The reason is that to find the right location you start at the root node and compare keys to decide whether to go into the left subtree or the right subtree, and you do this until you find the right location. The worst case is when you have to travel down the longest chain, whose length is exactly the height of the tree.
If you don't have any constraints and allow any tree then this is going to be O(N) since you allow a tree with only left children (for example).
If you want better guarantees you must use algorithms that promise an upper bound on the height of the tree. For example, AVL trees guarantee that the tree is balanced, so the max height is always O(log N) and all the operations above run in O(log N). Red-black trees don't guarantee balance as tightly, but promise that the tree is not going to be too unbalanced (min height * 2 >= max height), which also keeps the complexity at O(log N).
Depending on your usage patterns you might be able to find more specialized data structures that give even better complexity (see Fibonacci heap).

Insertion into a Binary Heap: Number of exchanges in worst case

I was going through Cormen's 'Algorithms Unlocked'. In chapter 6, on shortest-path algorithms, regarding inserting data into a binary heap, I find this: "Since the path to the root has at most floor(lg(n)) edges, at most floor(lg(n))-1 exchanges occur, and so INSERT takes O(lg(n)) time." Now, I know the resulting complexity of insertion in a binary heap is as mentioned, but regarding the number of exchanges in the worst case, should it not be floor(lg(n)) instead of floor(lg(n))-1? The book's errata says nothing about this. So I was wondering if I missed something.
Thanks and Regards,
Aditya
You can easily show that it's floor(lg(n)). Consider this binary heap:
  3
 / \
5   7
To insert the value 1, you first add it to the end of the heap:
    3
   / \
  5   7
 /
1
So there are 4 items in the heap. It's going to take two swaps to move the item 1 to the root. floor(lg(4)) is equal to 2.
floor(lg(n)) is the correct expression for the maximum number of edges on a path between a leaf and the root, and when you do swaps, you may end up doing one swap per edge. So floor(lg(n)) is the correct answer for the worst-case number of swaps. The author most likely confused the number of edges on the path with the number of VERTICES on the path: if you have V vertices on the path between the leaf and the root, then V-1 is the number of edges, so V-1 is the number of swaps you might do in the worst case.
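A quick way to convince yourself is to count the exchanges directly. Here is a small min-heap insert in Python (a sketch of my own, not the book's code) that returns the number of swaps performed:

    import math

    def heap_insert(heap, value):
        # Append at the end, then bubble up toward the root,
        # counting one exchange per edge traversed.
        heap.append(value)
        i, swaps = len(heap) - 1, 0
        while i > 0:
            parent = (i - 1) // 2
            if heap[parent] <= heap[i]:
                break
            heap[i], heap[parent] = heap[parent], heap[i]
            i, swaps = parent, swaps + 1
        return swaps

    heap = [3, 5, 7]
    print(heap_insert(heap, 1))               # 2
    print(math.floor(math.log2(len(heap))))   # floor(lg 4) = 2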

Big O(h) vs. Big O(logn) in trees

I have a question about the time complexity of tree operations.
It's said (Data Structures, Horowitz et al.) that the time complexity of insertion, deletion, search, and finding the min, max, successor and predecessor of a node in a BST is O(h), while in an AVL tree it is O(log n).
I don't exactly understand what the difference is. With h = floor(log n) + 1 in mind, why do we say O(h) in one place and O(log n) in another?
h is the height of the tree. It is always Omega(log n) [never asymptotically smaller than log n]. It can be very close to log n in a complete tree (there you really get h = log n + 1), but in a tree that has decayed to a chain (each node has only one child) it is O(n).
For balanced trees, h = O(log n) (and in fact Theta(log n)), so any O(h) algorithm on those is actually O(log n).
The idea of self-balancing search trees (and AVL is one of them) is to prevent the cases where the tree decays to a chain (or somewhere close to it); their balancing guarantees ensure an O(log n) height.
EDIT:
To understand this issue better, consider the next two trees (and forgive me for being a terrible ASCII artist):
tree 1                      tree 2

            7
           /
          6
         /
        5                      4
       /                     /   \
      4                     2     6
     /                     / \   / \
    3                     1   3 5   7
   /
  2
 /
1
Both are valid binary search trees, and in both, searching for an element (say 1) will be O(h). But in the first, O(h) is actually O(n), while in the second it is O(log n).
O(h) means the complexity is linear in the tree height. If the tree is balanced, this becomes O(log n) (n being the number of elements). But that is not true for all trees. Imagine a very unbalanced binary tree where each node has only a left child; this tree becomes similar to a list, and the number of elements in it equals its height. The complexity of the described operations will then be O(n) instead of O(log n).
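To see the O(h) bound in code, consider this small sketch (with a hypothetical Node class): the search loop takes one step down per comparison, so its cost is bounded by the length of the path it walks, i.e. by the height of the tree.

    class Node:
        def __init__(self, key, left=None, right=None):
            self.key, self.left, self.right = key, left, right

    def height(node):
        # Longest root-to-leaf path, counted in nodes.
        if node is None:
            return 0
        return 1 + max(height(node.left), height(node.right))

    def search(node, key):
        # One step down per comparison: O(h) in the worst case.
        while node is not None and node.key != key:
            node = node.left if key < node.key else node.right
        return node

    chain = Node(3, Node(2, Node(1)))        # decayed to a chain: h = n = 3
    balanced = Node(2, Node(1), Node(3))     # balanced: h = 2
    print(height(chain), height(balanced))   # 3 2
    print(search(chain, 1).key)              # found after walking the whole chain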
