A threaded binary tree is efficient to traverse because it needs neither recursion nor a stack. My doubt is whether this makes every insertion take O(n) (where n is the number of nodes in the tree), since the tree would have to be threaded again for every node we insert. Is that right? If so, aren't threaded binary trees ineffective in practice?
As the Wikipedia article says:
"A binary tree is threaded by making all right child pointers that
would normally be null point to the inorder successor of the node (if
it exists), and all left child pointers that would normally be null
point to the inorder predecessor of the node."
The key here is "that would normally be null."
Typically, you either include additional fields for the thread links, or you use a bit flag to determine whether the left and right nodes are children or inorder successor/predecessor.
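Because the interior node pointers are still the traditional left/right binary tree node references, you can use the standard search to find an insertion spot in O(log n) time if the tree is balanced (O(h) for a tree of height h in general). The new node always becomes a leaf, so only a constant number of links change: the new node takes over the thread it replaces, and its other thread points back at its parent. The tree never has to be re-threaded as a whole.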
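To make the constant-cost threading concrete, here is a minimal sketch of insertion into a double-threaded binary search tree using the bit-flag approach mentioned above. The names `Node`, `left_is_thread`, `right_is_thread`, `insert` and `inorder` are my own choices for illustration, not taken from the Wikipedia article.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None             # left child, or in-order predecessor if left_is_thread
        self.right = None            # right child, or in-order successor if right_is_thread
        self.left_is_thread = True
        self.right_is_thread = True


def insert(root, key):
    """Insert key into a double-threaded BST and return the root."""
    new = Node(key)
    if root is None:
        return new
    cur = root
    while True:
        if key < cur.key:
            if cur.left_is_thread:
                # The new leaf inherits cur's predecessor thread,
                # and cur becomes the new leaf's successor.
                new.left = cur.left
                new.right = cur
                cur.left = new
                cur.left_is_thread = False
                return root
            cur = cur.left
        else:
            if cur.right_is_thread:
                # Mirror case: inherit cur's successor thread.
                new.right = cur.right
                new.left = cur
                cur.right = new
                cur.right_is_thread = False
                return root
            cur = cur.right


def inorder(root):
    """In-order traversal that follows threads: no stack, no recursion."""
    out = []
    cur = root
    if cur is None:
        return out
    while not cur.left_is_thread:        # start at the leftmost node
        cur = cur.left
    while cur is not None:
        out.append(cur.key)
        if cur.right_is_thread:
            cur = cur.right              # thread jumps straight to the successor
        else:
            cur = cur.right              # real child: successor is its leftmost node
            while not cur.left_is_thread:
                cur = cur.left
    return out
```

Only the four assignments at the insertion point touch thread links, so the cost of an insert is dominated by the descent, not by any re-threading.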
The OCaml reference manual provides an example of priority queue implementation.
It's a graph-based implementation of a heap. I say 'heap' because each node has 0, 1 or 2 children and each parent node is less than or equal to its children. However, it's not a 'binary heap', as the insertion algorithm doesn't force leaves to be left-most aligned (as the Wikipedia definition requires), so the tree isn't complete.
My intuition is that the tree is balanced though, because each time we insert a new node, the left sub-tree moves to the right, and the previous right sub-tree gets the node added and becomes the new left sub-tree. The following insertion moves what is now the right sub-tree back to the left, with the new node added to it.
So the depth of the left sub-tree never differs by more than 1 from the depth of the right sub-tree, and the tree is balanced. Hence we should never end up with a tree in linked-list form, and the worst-case complexity should remain O(log n), while the insertion algorithm is much simpler because it only keeps the tree balanced rather than complete.
Is my intuition correct here? I did some research and didn't find this algorithm elsewhere (most algorithms focus on the array-based implementation, which obviously requires a complete tree, otherwise some slots could be 'invalid').
Thanks
You are correct about the way that the heap maintains balance during inserts.
The removeMin operation, however, can disturb the balance. For example, all the elements on the left can be lower than all the elements on the right, in which case repeated removals only shrink the left sub-tree. Nothing restores the balance afterwards, so it may be lost.
So this heap does not provide any O(log N) guarantee, if N is the size of the heap. It does, though, if N is the total number of inserts, and that's not too bad. It doesn't hurt the complexity of most algorithms that use heaps.
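To illustrate why inserts alone keep the tree balanced, here is a small Python sketch of the swap-the-subtrees insertion described in the question. It is my own reconstruction from that description, not the OCaml manual's code; the tuple layout `(prio, left, right)` and the names `insert` and `height` are assumptions made for the example.

```python
def insert(queue, prio):
    """Insert prio into a min-heap stored as nested tuples (prio, left, right).

    The element pushed down goes into the old right sub-tree, which then
    becomes the new left sub-tree; the old left sub-tree moves to the right.
    """
    if queue is None:
        return (prio, None, None)
    p, left, right = queue
    if prio <= p:
        # New priority becomes the root; the old root is pushed down.
        return (prio, insert(right, p), left)
    return (p, insert(right, prio), left)


def height(queue):
    """Number of nodes on the longest root-to-leaf path."""
    if queue is None:
        return 0
    _, left, right = queue
    return 1 + max(height(left), height(right))
```

Because the two sub-trees trade places on every insert, their sizes never differ by more than one at any node, so the height stays logarithmic in the number of inserts even when the priorities arrive in sorted order.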
In a binary max heap implemented as a binary tree (where each node stores a pointer to its parent, left child, and right child), if you have the pointer to the root of the heap, how would you implement an insert operation? What's supposed to happen is the node first gets inserted as the last element in the last row. For array based, you could append to the array, but for tree based implementation, how would you find the right spot?
In this older question, I gave a short algorithm that uses the binary representation of the number k in order to find a way to select the k-th node out of a binary heap in a top-down traversal. Assuming that you keep track of the number of nodes in the explicit tree representation of the binary heap, you could do the following to do an insert operation:
Using the above algorithm, determine where the new node should go, then insert the node at that position.
Continuously bubble the node upward either by rewiring the tree to swap it with its parent or by exchanging the data fields of the node and its parent until the element is in its final position.
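The two steps above can be sketched as follows. This is a minimal illustration under my own assumptions: the class names `Node` and `TreeHeap` are hypothetical, the heap tracks its own `size`, and bubbling up is done by exchanging data fields rather than rewiring pointers.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.parent = parent
        self.left = None
        self.right = None


class TreeHeap:
    """Max-heap stored as linked nodes; size is tracked to locate the next slot."""

    def __init__(self):
        self.root = None
        self.size = 0

    def insert(self, key):
        self.size += 1
        if self.root is None:
            self.root = Node(key)
            return
        # The binary digits of the new position, after the leading 1,
        # spell the path from the root: 0 = go left, 1 = go right.
        bits = bin(self.size)[3:]
        cur = self.root
        for b in bits[:-1]:
            cur = cur.left if b == "0" else cur.right
        node = Node(key, parent=cur)
        if bits[-1] == "0":
            cur.left = node
        else:
            cur.right = node
        # Bubble up by exchanging keys with the parent while out of order.
        while node.parent is not None and node.parent.key < node.key:
            node.key, node.parent.key = node.parent.key, node.key
            node = node.parent
```

Both the descent and the bubble-up walk a single root-to-leaf path, so each insert costs O(log n) on the complete tree this produces.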
Hope this helps!
If you hang your new vertex under any leaf of your tree (as left or right successor, it doesn't matter), and then repair the heap from this new vertex to the top (that is, at each vertex on the way up, swap it with its greater successor if the heap order is violated, and keep climbing if needed), your new element will find its rightful place without breaking the heap. However, this only guarantees that each insert operation takes O(h) time, where h is the maximum height of the tree.
It's better to represent the heap as an array, obviously, because that way it's guaranteed that every insert operation will take O(log N) time.
To find the exact location as to where the new node is supposed to be inserted, we use the binary representation of the Binary Heap's Size. This takes O(log N) and then we bubble it up which takes O(log N). So the insertion operation takes O(log N)... For a detailed explanation check out my blog's post on Binary Heaps -
http://theoryofprogramming.com/2015/02/01/binary-heaps-and-heapsort-algorithm/
I hope it helped you, if it did, let me know...! ☺
I'm studying how to balance trees and I have some questions
Is it possible to balance a normal binary tree? If yes, which algorithm should be used?
Do I necessarily have to use a AVL or Red-black tree to obtain a balanced tree? How do these work?
I read something about rotations, weights but I'm kind of confused right now
Is it possible to balance a normal binary tree? If yes, which
algorithm should be used?
In O(n) you can build a complete tree and populate it with the elements in in-order.
It cannot be done better, because a BST might in rare cases degenerate into a chain (linked list), where every node has one null child. In this case, accessing the element in the middle is O(n) by itself.
Do I necessarily have to use a AVL or Red-black tree to obtain a
balanced tree?
There are other balanced trees such as B+ trees, and other data structures (not trees) such as skip-lists. You might want to have a look at a list of known data structures, especially the trees section.
How do these work?
I find the Wikipedia articles on both the AVL tree and the Red-Black tree very informative. If there is something specific you don't understand there, you should ask.
Also: trying to implement a balanced tree on your own (implementing a known tree, not inventing a new one, of course) is great for educational purposes, and by doing so you will definitely understand how it works.
Well... AVL and red-black trees are "normal binary trees" that are balanced, and keep that balance (for some definition of "balanced"). I'm not a computer science teacher to come up with my own explanation of the algorithms, and I guess you aren't looking for a cut&paste from Wikipedia :-)
Now, for balancing binary trees: if the tree is a search tree (i.e. 'sorted', but 'balanced' doesn't really make all that much sense if it's not) you could always just recreate the tree. The simplest algorithm is to use an array with all the elements from the tree, in sorted order (easily obtained from an inorder traversal). Then build an algorithm around this general idea:
take the middle element of the array as the root of the tree. This will create a tree node, and two arrays "left" and "right", which are meant to form the left and right subtrees
Apply this same algorithm recursively to create a tree from the "left" array and one from the "right" array. These two trees become the children of the parent node.
You might have to be careful with the case when the array has an even number of elements: there is no obvious "middle element", and removing one of the two candidates will create arrays of different sizes. The sizes differ by at most one element, though, so the heights of the two subtrees differ by at most one level and the balance is not affected.
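The general idea above can be sketched like this. It is a minimal example rather than a tuned implementation; the names `Node`, `build_balanced`, `inorder`, `rebalance` and `height` are made up for illustration, and index ranges are used instead of copying "left" and "right" arrays.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder(node, out):
    """Append the keys of the tree to out in sorted (in-order) order."""
    if node is not None:
        inorder(node.left, out)
        out.append(node.key)
        inorder(node.right, out)


def build_balanced(keys, lo=0, hi=None):
    """Build a balanced BST from the sorted slice keys[lo:hi]."""
    if hi is None:
        hi = len(keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2          # for an even count this picks the upper middle
    return Node(keys[mid],
                build_balanced(keys, lo, mid),
                build_balanced(keys, mid + 1, hi))


def rebalance(root):
    """Recreate the tree: in-order dump to an array, then rebuild from the middle."""
    keys = []
    inorder(root, keys)
    return build_balanced(keys)


def height(node):
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))
```

With an even number of elements the two halves differ in size by one, which is harmless: every node ends up with subtrees whose sizes differ by at most one.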
Of course, doing something like this every time you change the tree isn't such a great idea; you really want to use self-balancing trees like AVL for that. Doing it after creating the tree might not be all that useful either: you could just use the array itself and do binary searches on it, instead of making a tree. The array IS just another form of a binary tree...
EDIT: there is a reason why a lot of computer scientists have spent a lot of time developing data structures and algorithms that perform well in certain situations. Rolling your own version of a balanced binary tree is unlikely to beat these...
Can you balance an unbalanced tree?
Yes, you can. You apply the same balance function you created for your AVL tree to every node, inside a PostOrderTraversal function.
Should You Do it?
No!!! You should recreate it! Balancing the tree will cost you unnecessarily.
How do I recreate it?
Use an InOrderTraversal function to put your nodes into an array. Then repeatedly take the middle element of the current range as the next node to add to the new tree, and do the same for the range to its left and the range to its right.
Is it possible to balance a normal binary tree? If yes, which algorithm should be used?
Do I necessarily have to use a AVL or Red-black tree to obtain a balanced tree? How do these work?
In general, trees are either unbalanced or balanced. AVL, Red-Black, 2-3, etc. are just trees with some properties, and according to their properties they use some extra variables and functions. Those extra variables and functions can also be used in "normal" binary trees. In other words, those functions and variables are not bound to their respective type of tree. The nodes of a "normal" binary tree always had a balance! You just didn't use it because you didn't care whether the "normal" binary tree was balanced or not. They also always had a height, depth, etc. You just didn't care. In general, you will realize at some point that it is all a trade-off between speed and memory. If you know what you are doing, using more memory can make your program faster. Using less memory means more calculations, so you will have a slower program.
I came across solution given at http://discuss.joelonsoftware.com/default.asp?interview.11.780597.8 using Morris InOrder traversal using which we can find the median in O(n) time.
But is it possible to achieve the same using O(logn) time? The same has been asked here - http://www.careercup.com/question?id=192816
If you also maintain the count of the number of left and right descendants of a node, you can do it in O(logN) time, by doing a search for the median position. In fact, you can find the kth largest element in O(logn) time.
Of course, this assumes that the tree is balanced. Maintaining the count does not change the insert/delete complexity.
If the tree is not balanced, then you have Omega(n) worst case complexity.
See: Order Statistic Tree.
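As a rough sketch of the counting idea, the example below stores a subtree size in every node and uses it to walk straight to the k-th smallest key. The names `Node`, `size`, `insert`, `select` and `median` are my own, and the plain BST insert used here does no rebalancing, so the O(log n) bound only holds if the tree happens to be balanced.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1                 # number of nodes in this subtree


def size(node):
    return node.size if node is not None else 0


def insert(root, key):
    """Plain (unbalanced) BST insert that keeps the subtree counts up to date."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    root.size += 1
    return root


def select(root, k):
    """Return the k-th smallest key (0-based) by following the counts."""
    node = root
    while node is not None:
        left = size(node.left)
        if k < left:
            node = node.left
        elif k == left:
            return node.key
        else:
            k -= left + 1             # skip the left subtree and this node
            node = node.right
    raise IndexError("k out of range")


def median(root):
    """Lower median of the keys stored in the tree."""
    return select(root, (size(root) - 1) // 2)
```

Each step of `select` descends one level, so the cost is proportional to the height of the tree.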
btw, BigO and Smallo are very different (your title says Smallo).
Unless you guarantee some sort of balanced tree, it's not possible.
Consider a tree that's completely degenerate -- e.g., every left pointer is NULL (nil, whatever), so each node only has a right child (i.e., for all practical purposes the "tree" is really a singly linked list).
In this case, just accessing the median node (at all) takes linear time -- even if you started out knowing that node N was the median, it would still take N steps to get to that node.
We can find the median by using a rabbit pointer and a turtle pointer. The rabbit moves twice as fast as the turtle in the in-order traversal of the BST. This way, when the rabbit reaches the end of the traversal, the turtle is at the median of the BST.
Please see the full explanation.
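A small sketch of the rabbit-and-turtle idea, written with two Python generators over the same in-order traversal. The names `Node`, `inorder` and `median` are assumptions for illustration. Note that this still walks the whole tree, so it is not faster than O(n); it only avoids counting the nodes first.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder(node):
    """Yield the keys of the tree in sorted order."""
    if node is not None:
        yield from inorder(node.left)
        yield node.key
        yield from inorder(node.right)


_END = object()   # sentinel marking the end of a traversal


def median(root):
    """Return the lower median, or None for an empty tree."""
    turtle = inorder(root)
    rabbit = inorder(root)
    result = None
    while True:
        # The rabbit takes two steps for every step of the turtle.
        if next(rabbit, _END) is _END:
            break
        result = next(turtle)
        if next(rabbit, _END) is _END:
            break
    return result
```

For an even number of keys this returns the lower of the two middle keys; averaging the two would need one more step of the turtle.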
I've tried to understand what sorted trees are and binary trees and avl and and and ...
I'm still not sure, what makes a sorted tree sorted? And what is the complexity (Big-Oh) between searching in a sorted and searching in an unsorted tree? Hope you can help me.
Binary Trees
There exist two main types of binary trees, balanced and unbalanced. A balanced tree aims to keep the height of the tree (height = the number of nodes between the root and the furthest leaf) as small as possible. There are several types of algorithms for balanced trees, the two most famous being AVL and red-black trees. The complexity for insert/delete/search operations on both AVL and red-black trees is O(log n) or better, which is the important part. Other self-balancing trees are AA, splay and scapegoat trees.
Balanced trees gain their property (and name) of being balanced from the fact that after every delete or insert operation on the tree the algorithm inspects the tree to make sure it's still balanced. If it's not, it will try to fix this (which is done differently by each algorithm) by rotating nodes around in the tree.
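Since rotations are the basic repair step these algorithms share, here is a minimal sketch of the two single rotations. The `Node` class and the function names are assumptions for illustration; real AVL or red-black code also updates heights or colours and decides when to rotate, which is left out here.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def rotate_left(x):
    """Lift x's right child above x and return the new subtree root."""
    y = x.right
    x.right = y.left      # y's left subtree sits between x and y in sorted order
    y.left = x
    return y


def rotate_right(y):
    """Lift y's left child above y and return the new subtree root."""
    x = y.left
    y.left = x.right      # x's right subtree sits between x and y in sorted order
    x.right = y
    return x
```

A rotation changes only a constant number of pointers and keeps the in-order sequence of keys intact, which is why it can shorten one side of the tree without breaking the search order.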
Normal (or unbalanced) binary trees do not modify their structure to keep themselves balanced, and they risk becoming very inefficient over time (especially if the values are inserted in order). However, if performance is not an issue and you mainly want a sorted data structure, they might do. The complexity of insert/delete/search operations on an unbalanced tree ranges from O(1) (best case, if you want the root) to O(n) (worst case, if you inserted all nodes in order and want the largest node).
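The effect of insertion order on an unbalanced tree can be seen with a short sketch. `Node`, `insert` and `height` are names chosen for this example, and the insert is the plain BST insert with no rebalancing.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain BST insert with no rebalancing; returns the root."""
    new = Node(key)
    if root is None:
        return new
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = new
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = new
                return root
            cur = cur.right


def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))
```

Inserting keys in sorted order makes every node the right child of the previous one, so the height equals the number of keys and searching for the largest key walks the whole chain.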
There exists another variation, called a randomized binary tree, which uses some kind of randomization to make sure the tree doesn't become fully unbalanced (which is the same as a linked list).
A binary search tree is a tree structure where every node has at most two child nodes.
All nodes in the left subtree of a node are less than that node, and all nodes in its right subtree are greater.
The interesting thing about a binary search tree is that we can search for a value in O(log n) when the tree is properly sorted and balanced. Doing the same search in a linked list, for example, would take O(n).
The best way to go about learning data structures would be to do a day of googling and reading Wikipedia articles.
This might get you started
http://en.wikipedia.org/wiki/Binary_search_tree
Do a google search for the following:
site:stackoverflow.com binary trees
to get a list of SO questions which will answer your several questions.
There isn't really a lot of point in using a tree structure if it isn't sorted in some fashion - if you are planning on searching for a node in the tree and it is unsorted, you will have to traverse the entire tree (O(n)). If you have a tree which is sorted in some fashion, then it is only necessary to traverse down a single branch of the tree (typically O(log n)).
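To contrast the two cases, here is a small sketch that counts how many nodes each kind of search touches. The names `Node`, `bst_search` and `tree_search` are made up for the example, and the BST search assumes the usual convention of smaller keys on the left.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def bst_search(root, key):
    """Search a sorted tree by following one branch. Returns (found, nodes visited)."""
    visited = 0
    node = root
    while node is not None:
        visited += 1
        if key == node.key:
            return True, visited
        node = node.left if key < node.key else node.right
    return False, visited


def tree_search(root, key):
    """Search a tree with no ordering: every node may need a look."""
    visited = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node is None:
            continue
        visited += 1
        if node.key == key:
            return True, visited
        stack.append(node.right)
        stack.append(node.left)
    return False, visited
```

On a balanced sorted tree the first function visits at most one node per level, while the second has to visit all n nodes before it can report that a key is missing.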
In a binary search tree the left child is always smaller than its parent, and the right child is always bigger, so you can search a sorted tree in O(log(n)) when it is balanced: you just need to go left if the key is smaller than the current node and right if it is bigger.