How to implement a Multiset with a Red-Black Tree?

I need some general background here, and I can't find it online.
My main question is: if I want to implement a Multiset with a red-black tree, do I have to insert every element of the Multiset into the tree (including every repeated element), or is there a way to store only the unique elements together with their multiplicities?
All of this should be done with a single red-black tree, no other structures.
(This is for homework, as you may have guessed.)

Just store each distinct element once, with the number of instances (> 0) recorded in its node.
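For example, here is a minimal sketch of that idea, using Java's TreeMap (itself a red-black tree) as a stand-in for a hand-rolled red-black tree; the class and method names are just illustrative:

import java.util.TreeMap;

// Sketch of a multiset backed by a single red-black tree (Java's TreeMap).
// Each distinct element is stored once, mapped to its multiplicity.
class TreeMultiset<E extends Comparable<E>> {
    private final TreeMap<E, Integer> counts = new TreeMap<>();

    void add(E e) {
        counts.merge(e, 1, Integer::sum);                              // bump the count
    }

    void remove(E e) {
        counts.computeIfPresent(e, (k, c) -> c > 1 ? c - 1 : null);    // drop the node at count 0
    }

    int count(E e) {
        return counts.getOrDefault(e, 0);
    }
}

If the assignment requires your own red-black tree, the same idea applies: keep a count field next to the key in each node, increment it on a duplicate insert, and only unlink the node when the count drops to zero.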

Is there a balanced BST where each node maintains its subtree size?

Is there a balanced BST structure that also keeps track of subtree size in each node?
In Java, TreeMap is a red-black tree, but it doesn't keep the subtree size in each node.
Previously I wrote a BST that keeps track of the subtree size of each node, but it isn't balanced.
The questions are:
Is it possible to implement such a tree while keeping the basic operations O(log n)?
If so, are there any third-party libraries that provide such an implementation?
A Java implementation would be great, but other languages (e.g. C, Go) would also be helpful.
BTW:
The subtree size should be stored in each node, so that the size can be obtained without traversing the subtree.
Possible application:
Keeping track of the rank of items whose value (which the rank depends on) might change on the fly.
The Weight Balanced Tree (also called the Adams Tree, or Bounded Balance tree) keeps the subtree size in each node.
This also makes it possible to find the Nth element, from the start or end, in log(n) time.
My implementation in Nim is on GitHub. It has the following properties:
Generic (parameterized) key,value map
Insert (add), lookup (get), and delete (del) in O(log(N)) time
Key-ordered iterators (inorder and revorder)
Lookup by relative position from beginning or end (getNth) in O(log(N)) time
Get the position (rank) by key in O(log(N)) time
Efficient set operations using tree keys
Map extensions to set operations with optional value merge control for duplicates
There are also implementations in Scheme and Haskell available.
That's called an "order statistic tree": https://en.wikipedia.org/wiki/Order_statistic_tree
It's pretty easy to add the size to any kind of balanced binary tree (red-black, AVL, B-tree, etc.), or you can use a balancing algorithm that works with the size directly, like weight-balanced trees (see DougCurrie's answer) or (better) size-balanced trees: https://cs.wmich.edu/gupta/teaching/cs4310/lectureNotes_cs4310/Size%20Balanced%20Tree%20-%20PEGWiki%20sourceMayNotBeFullyAuthentic%20but%20description%20ok.pdf
Unfortunately, I don't think there are any standard-library implementations, but you can find open-source ones if you look, or you may want to roll your own.
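As a sketch of the augmentation (assuming a node type with left/right pointers and a size field that every insert, delete and rotation keeps up to date; the names are made up for illustration), selecting the k-th smallest key and computing a key's rank both become O(log n) walks:

// Hypothetical size-augmented node; any balanced tree (red-black, AVL, ...)
// can carry this field as long as its restructuring operations update it.
class SizedNode {
    int key;
    SizedNode left, right;
    int size;                        // nodes in this subtree, including this one
}

static int size(SizedNode n) { return n == null ? 0 : n.size; }

// k-th smallest key, 0-based; assumes 0 <= k < size(n)
static int select(SizedNode n, int k) {
    int leftSize = size(n.left);
    if (k < leftSize) return select(n.left, k);
    if (k == leftSize) return n.key;
    return select(n.right, k - leftSize - 1);
}

// number of keys in the tree that are strictly smaller than 'key'
static int rank(SizedNode n, int key) {
    if (n == null) return 0;
    if (key <= n.key) return rank(n.left, key);
    return size(n.left) + 1 + rank(n.right, key);
}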

Threaded Binary Search Trees Advantage

An explanation of threaded binary search trees (skip it if you already know them):
In a binary search tree with n nodes there are n+1 left and right pointers that are null. To make use of that otherwise wasted memory, we change the tree as follows:
for every node z in the tree:
if left[z] = NULL, we store in left[z] a pointer to tree-predecessor(z) (i.e. the node that contains the predecessor key);
if right[z] = NULL, we store in right[z] a pointer to tree-successor(z) (i.e. the node that contains the successor key).
A tree like that is called a threaded binary search tree, and the new links are called threads.
And my question is:
What is the main advantage of threaded binary search trees compared to "regular" binary search trees?
A quick web search tells me that they make it possible to implement in-order traversal iteratively rather than recursively.
Is that the only difference? Are there other ways to use the threads?
Is that really a meaningful advantage, and if so, why?
Recursive traversal costs O(n) time too, so...
Thank you very much.
Non-recursive in-order scanning is a huge advantage. Imagine that somebody asks you to find the value "5" and the four values that follow it. That's awkward to do with a recursive traversal. But if you have a threaded tree it's easy: search the tree for the value "5", then follow the threaded links to get the next four values.
Similarly, what if you want the four values that precede a particular value? That's difficult with a recursive traversal, but trivial if you find the item and then walk the threaded links backwards.
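A sketch of that walk, assuming a threaded node in which boolean flags record whether each pointer is a real child or a thread (the field names are made up for illustration):

// Hypothetical threaded node: when rightIsThread is true, 'right' points to the
// in-order successor; when leftIsThread is true, 'left' points to the predecessor.
class ThreadedNode {
    int key;
    ThreadedNode left, right;
    boolean leftIsThread, rightIsThread;
}

// In-order successor with no stack and no recursion.
static ThreadedNode successor(ThreadedNode n) {
    if (n.rightIsThread || n.right == null) return n.right;       // follow the thread
    ThreadedNode cur = n.right;                                    // else: leftmost real node of the right subtree
    while (!cur.leftIsThread && cur.left != null) cur = cur.left;
    return cur;
}

// Collect the k keys that follow 'start' in key order.
static java.util.List<Integer> nextK(ThreadedNode start, int k) {
    java.util.List<Integer> out = new java.util.ArrayList<>();
    for (ThreadedNode cur = successor(start); cur != null && out.size() < k; cur = successor(cur)) {
        out.add(cur.key);
    }
    return out;
}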
The main advantage of threaded binary search trees over regular ones is that traversal is more efficient.
Traversal no longer needs an explicit stack (or the implicit call stack of recursion): each node has pointers that give its in-order successor and predecessor directly, whereas traversing a normal BST requires a stack, which costs extra memory.

Tree with Multiple Children - Linear Time Access Complexity

My problem is not with the structure that holds the tree but with the way I am using it, because I think this implementation will be costly in the long run.
I have a tree structure in which each node contains a list of references to its children. The problem is that finding a particular child of a node means scanning that list, which takes linear time. I also need to store all of these as immediate children (the word "children" here means the immediate children only).
Is there some structure other than a list in which I can keep the children, so that retrieving and deleting a child is efficient, ideally logarithmic?
When traversing the tree, to get from the root to the right child (the one I'm looking for), I have to check a condition on each child node; that check is a linear search.
I just want a technique that improves this search for the right child in the children list during traversal.
Instead of having each node keep a plain list, have it keep either a sorted structure (O(log n) lookup) or a hash map (expected constant-time lookup). In this case a sorted structure is probably best, since you can easily iterate over the elements in order and save space.
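For instance (a sketch with made-up field names), each node can key its immediate children by whatever you search on; Java's TreeMap gives O(log n) lookup, removal and ordered iteration, and a HashMap would give expected O(1) lookup instead:

import java.util.TreeMap;

// Sketch: children keyed by label (or whatever you look them up by).
class TreeNode {
    final String label;
    final TreeMap<String, TreeNode> children = new TreeMap<>();

    TreeNode(String label) { this.label = label; }

    TreeNode child(String name)   { return children.get(name); }   // O(log n) lookup
    void addChild(TreeNode c)     { children.put(c.label, c); }    // O(log n) insert
    void removeChild(String name) { children.remove(name); }       // O(log n) delete
}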

Trees and priority queues

I'm trying to solve an exercise that is turning out to be a little difficult, since I have to implement a priority queue starting from a template class for a tree (a kind of red-black or binary search tree).
The template looks like this:
class Node
int key
Node left
Node right
Node parent
int leftNodes
int rightNodes
Initially, when inserting a new element, I tried to completely fill a level of the tree, then use an in-order traversal/sort to fill an array, generate a binary search tree from that array, and replace the original root with the new one, expecting the result to be a balanced tree.
Unfortunately this approach turns out to be inappropriate, since the tree must maintain the max-heap property while staying balanced across every insertion and deletion (and my code didn't handle completely filling a tree level well). Is it possible to implement a tree with heap capabilities? I mean a tree in which each element is greater than or equal to its children, which remains balanced after insertion, and which rebalances itself when the root node (the element with the largest key) is deleted.
You probably want to implement a binary heap, see http://en.wikipedia.org/wiki/Binary_heap
IIRC one of the main advantages of this data structure is that it can be embedded in an array (because of the balanced nature). Heapsort uses this kind of data structure to sort in-place.
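A minimal sketch of such an array-embedded binary max-heap (the parent of index i sits at (i-1)/2 and its children at 2i+1 and 2i+2, so the tree is always complete and therefore balanced); the class name and the use of an ArrayList are just illustrative:

import java.util.ArrayList;
import java.util.List;

class MaxHeap {
    private final List<Integer> a = new ArrayList<>();

    void insert(int x) {                                  // O(log n): append, then sift up
        a.add(x);
        int i = a.size() - 1;
        while (i > 0 && a.get((i - 1) / 2) < a.get(i)) {
            swap(i, (i - 1) / 2);
            i = (i - 1) / 2;
        }
    }

    int extractMax() {                                    // O(log n): assumes the heap is non-empty
        int max = a.get(0);
        a.set(0, a.get(a.size() - 1));                    // move the last element to the root
        a.remove(a.size() - 1);
        int i = 0;
        while (true) {                                    // sift down
            int l = 2 * i + 1, r = 2 * i + 2, largest = i;
            if (l < a.size() && a.get(l) > a.get(largest)) largest = l;
            if (r < a.size() && a.get(r) > a.get(largest)) largest = r;
            if (largest == i) break;
            swap(i, largest);
            i = largest;
        }
        return max;
    }

    private void swap(int i, int j) {
        int t = a.get(i); a.set(i, a.get(j)); a.set(j, t);
    }
}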

Binary Tree's usage

Can someone give me a real-life example (in programming, C#) of needing to use a Binary Tree or even just an ordinary tree?
I understand the principle of a Binary Tree and how they work, but I'm trying to find some real-life examples of their usage.
Tony
In C#, Java, Python, C++ (using the STL) and other high-level languages, most of the time you will use one of the built-in/library-included types to store your data, at least the data you work on at the moment, so most of the time you won't be using a binary tree or another kind of tree explicitly.
This being said, some of these built-in types are implemented as trees of one kind or another "in the backstage", and in some situations you will have to implement one yourself.
Also, a related thing you HAVE to know is binary search. This is mostly done in binary trees (binary search trees :P), but the idea can be extrapolated to a lot of problems, even without trees involved, so try to understand it well.
Edit: Real life classical example:
Imagine that you want to find the phone number of a particular person in the phone book of a big city. All things being equal, you open it roughly in the middle, look at the names on that page, and see whether your "target" comes before or after it, thus cutting the data in half. Then you repeat the operation in the half where you know your "target" is, again and again until you find your "target". Since each time you are looking at half the data you had before, you need a total of log(base 2) n operations to reach your "target", where n is the total size of the data.
So in a phone book with 1 million entries, you find your target in log(base 2) 1 million = 20 comparisons, instead of comparing one by one as in a linear search (that's 1 million comparisons in the worst case).
Note that this only works on already sorted data.
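The same strategy on a sorted array, as a small sketch:

// Binary search on a sorted array: O(log(base 2) n) comparisons, exactly the
// phone-book strategy above. Returns the index of 'target', or -1 if absent.
static int binarySearch(int[] sorted, int target) {
    int lo = 0, hi = sorted.length - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;             // open the "book" in the middle
        if (sorted[mid] == target) return mid;
        if (sorted[mid] < target) lo = mid + 1;   // target is in the upper half
        else hi = mid - 1;                        // target is in the lower half
    }
    return -1;
}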
Balanced binary trees, storing data maintained in sorted order, are used to achieve O(log(n)) lookup, delete, and insert times. "Balanced" just means there is a bounded limit between the depth of the shallowest and deepest leaves, counting empty left/right nodes as leaves. (Optimally, the depths of the left and right subtrees differ by at most one; some implementations relax this to make the algorithms simpler.)
You can use an array, rather than a tree, in sorted order with binary search to achieve O(log(n)) lookup time, but then the insert/delete times are O(n).
Some trees (notably B-trees for databases) use more than 2 branches per node, to widen the tree and reduce the maximum depth (which determines search times).
I can't think of a reason to use binary trees that are not maintained in sorted order (a point that has not been mentioned in most of the answers here), but maybe there's some application for this. Besides the sorted binary balanced tree, anything with hierarchy (as other answerers have mentioned, XML or directory structures) is a good application for trees, whether binary or not.
edit: re: unsorted binary trees: I just remembered that LISP and Scheme both make heavy use of unbalanced binary trees. The cons function takes two arguments (e.g. (define c (cons a b)) ) and returns a tree node whose branches are the two arguments. The car function takes such a tree node and returns the first argument given to cons. The cdr function is similar but returns the second argument to cons. Finally nil represents a null object. These are the primitives used to make all data structures in LISP and Scheme. Lists are implemented using an extremely unbalanced binary tree. The list containing literal elements 'Alabama, 'Alaska, 'Arizona, and 'Arkansas can be constructed explicitly as
(cons 'Alabama (cons 'Alaska (cons 'Arizona (cons 'Arkansas nil))))
and can be traversed using car and cdr (where car is used to get the head of the list and cdr is used to get the sublist excluding the list head). This is how Scheme works, I think LISP is the same or very similar. More complicated data structures, like binary trees (which need 3 members per node: two to hold the left and right nodes, and a third to hold the node value) or trees containing more than two branches per node can be constructed using a list to implement each node.
How about the directory structure in Unix? For instance, the du command (the disk usage command) does a post-order traversal (traversal order: left child -> right child -> root node) of a tree representing the directory structure in order to compute the disk space used by each directory.
The following slides should help.
http://www.cse.unt.edu/~rada/CSCE3110/Lectures/Trees.ppt
cheers
In Java, trees are used to implement certain sorted data structures, such as the TreeSet:
http://java.sun.com/j2se/1.5.0/docs/api/java/util/TreeSet.html
They are used for data structures where you want the order to be based on some property of the elements, rather than on insertion order.
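For example (the values here are just illustrative):

import java.util.TreeSet;

public class TreeSetDemo {
    public static void main(String[] args) {
        // TreeSet keeps its elements sorted (natural order or a supplied Comparator),
        // regardless of insertion order; add/contains/remove are O(log n).
        TreeSet<String> names = new TreeSet<>();
        names.add("Charlie");
        names.add("Alice");
        names.add("Bob");
        System.out.println(names.first());   // Alice
        System.out.println(names);           // [Alice, Bob, Charlie]
    }
}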
Here are some examples:
The in-memory representation of a parsed program or expression is a tree. In the case of expressions (excluding ternary operators) the tree will be binary.
The components of a GUI are organized as a tree.
Any "containment" hierarchy can be represented as a tree. (HTML, XML and SGML are examples.
And of course, binary (and n-ary) trees can be used to represent indexes, maps, sets and other "generic" data structures.
An easy example is searching. If you store your list data in a tree, for example, you get O(log(n)) lookup times. A standard array implementation of a list would achieve O(n) lookup time.
XML, HTML (and SGML) documents are trees.

Resources