I am building a tree-like data structure. What is the expected behavior if I have a method
public Set getSiblingNodes(Node node);
Should this method return a set that includes or excludes the node itself?
Thanks!
No. It should contain only its siblings.
A node is not a sibling of itself.
(Why would you think otherwise?)
Generally, no.
But you can define a kind of tree where the siblings form a circular list; if that list has only one node, that node is a sibling of itself.
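To make the usual convention concrete, here is a minimal C++ sketch (the question's signature is Java; the `Node` struct with `parent` and `children` fields is an assumption for illustration, not the asker's actual class). It returns the parent's other children and leaves the node itself out:

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical node type: each node knows its parent and its children.
struct Node {
    Node* parent = nullptr;
    std::vector<Node*> children;
};

// Returns the siblings of `node`: every child of its parent except `node` itself.
// A root (no parent) or a null argument yields an empty set.
std::set<Node*> getSiblingNodes(const Node* node) {
    std::set<Node*> siblings;
    if (node == nullptr || node->parent == nullptr) return siblings;
    for (Node* child : node->parent->children) {
        if (child != node) siblings.insert(child);
    }
    return siblings;
}
```

An only child therefore gets an empty set, not a set containing itself.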
That is, return a boolean of whether or not an element was actually removed in the tree.
The common implementation is to call find() to see if the element is in your tree and return false if find() fails to find the target. This requires going down the tree twice: once for find() and once for remove().
One way is to set a private field flag that you will set upon finding the element during remove(). Seems kinda gross though. Anyone have any better ideas?
You can remove an element from the BST in a single traversal. There is no need for a separate traversal to find the element first. To remove it in one pass, follow these steps:
Search for the node to be deleted.
If the node is found, apply the removal algorithm; otherwise report that nothing was removed.
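A minimal sketch of the one-pass idea in C++, assuming a plain unbalanced BST of `int` keys whose nodes were allocated with `new` (the names `BstNode` and `removeKey` are made up for this example). The search keeps a pointer to the link that leads to the current node, so once the key is found the node can be unlinked on the spot, and the same walk tells you whether anything was removed:

```cpp
#include <cassert>

struct BstNode {
    int key;
    BstNode* left = nullptr;
    BstNode* right = nullptr;
    explicit BstNode(int k) : key(k) {}
};

// Removes `key` from the tree rooted at `root` in a single downward pass.
// Returns true if a node was removed, false if the key was not present.
bool removeKey(BstNode*& root, int key) {
    BstNode** link = &root;  // the pointer that points at the current node
    while (*link != nullptr && (*link)->key != key) {
        link = key < (*link)->key ? &(*link)->left : &(*link)->right;
    }
    if (*link == nullptr) return false;  // not found: nothing to remove

    BstNode* victim = *link;
    if (victim->left != nullptr && victim->right != nullptr) {
        // Two children: copy the in-order successor's key into this node,
        // then unlink the successor instead (it has no left child).
        BstNode** succ = &victim->right;
        while ((*succ)->left != nullptr) succ = &(*succ)->left;
        victim->key = (*succ)->key;
        link = succ;
        victim = *succ;
    }
    // At most one child remains: splice it into the victim's place.
    *link = victim->left != nullptr ? victim->left : victim->right;
    delete victim;
    return true;
}
```

No flag field is needed; the return value falls out of the search itself.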
This is not a homework question. I heard that it is possible to mirror a binary tree i.e. flip it, in constant time. Is this really the case?
Sure, depending on your data structure, you would just do the equivalent of: instead of traversing down the left node and then the right node, you would traverse down the right node, and then the left node. This could be a parameter passed into the recursive function that traverses the tree (i.e. in C/C++, a bool bDoLeftFirst, and an if-statement that uses that parameter to decide which order to traverse the child nodes in).
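Here is a rough C++ sketch of that idea, assuming the flag lives on a small wrapper around the root instead of being passed as a parameter (`BinTree`, `mirrored` and `inOrder` are illustrative names). Mirroring only toggles a boolean, which is the constant-time part; every traversal then pays attention to the flag:

```cpp
#include <cassert>
#include <vector>

struct BinNode {
    int value = 0;
    BinNode* left = nullptr;
    BinNode* right = nullptr;
};

struct BinTree {
    BinNode* root = nullptr;
    bool mirrored = false;  // when true, left and right are logically swapped

    // O(1): no node is touched, only the interpretation changes.
    void mirror() { mirrored = !mirrored; }

    // In-order traversal that honors the mirrored flag.
    std::vector<int> inOrder() const {
        std::vector<int> out;
        visit(root, out);
        return out;
    }

private:
    void visit(const BinNode* n, std::vector<int>& out) const {
        if (n == nullptr) return;
        visit(mirrored ? n->right : n->left, out);
        out.push_back(n->value);
        visit(mirrored ? n->left : n->right, out);
    }
};
```

Note that the nodes themselves are never rewritten, so any code that reads `left` and `right` directly, bypassing the flag, still sees the original tree.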
Did you mean "invert binary tree", the problem that Max Howell could not solve, which is why he was rejected by Google?
https://leetcode.com/problems/invert-binary-tree/
You can find solutions in the "discuss" section.
I'm implementing a level-order succinct trie and I want to be able, for a given node, to jump back to its parent.
I tried several combinations of rank/level but I can't wrap my head around this one...
I'm using this article as base documentation:
http://stevehanov.ca/blog/index.php?id=120
It explains how to traverse children, but not how to go up.
Thanks to this MIT lecture (http://www.youtube.com/watch?v=1MVVvNRMXoU) I know this is possible (in constant time, as stated at 15:50), but the speaker only explains it for a binary trie (e.g. using the formula select1(floor(i/2))).
How can I do that on a k-ary trie?
Well, I don't know what select1() is, but the other part (floor(i/2)) looks like the trick you would use in an array-embedded binary tree, like those described here. You would divide by 2 because every parent has exactly 2 children --> every level uses twice the space of the parent level.
If you don't have the same number of children in every node (except for leaves and perhaps one node with fewer children), you can't use this trick.
If you want to know the parent of any given node, you will need to add a pointer to the parent in every node.
Though, since trees are generally traversed starting at the root and going down, the usual thing to do is to store, in an array, the pointers to the nodes of the path. At any given point, the parent of the current node is the previous element in the array. This way you don't need to add a pointer to the parent in every node.
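As a small illustration of the path-array approach, here is a C++ sketch over a plain binary search tree (an assumption made to keep the example short; `PathNode` and `pathTo` are invented names, and the same idea applies to a trie). The descent records every node it visits, so the parent of the node you end on is simply the previous entry:

```cpp
#include <cassert>
#include <vector>

struct PathNode {
    int key = 0;
    PathNode* left = nullptr;
    PathNode* right = nullptr;
};

// Walks down from `root` toward `key`, recording each visited node.
// If the key is present, the last entry is its node and the entry before it
// is that node's parent. No parent pointer is stored in the nodes.
std::vector<PathNode*> pathTo(PathNode* root, int key) {
    std::vector<PathNode*> path;
    for (PathNode* n = root; n != nullptr; n = key < n->key ? n->left : n->right) {
        path.push_back(n);
        if (n->key == key) break;  // found: stop before stepping further down
    }
    return path;
}
```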
I think I've found my answer. This paper by Guy Jacobson explains it in section 3.2, "Level-order unary degree sequence".
parent(x){ select1(rank0(x)) }
Space-efficient Static Trees and Graphs
http://www.cs.cmu.edu/afs/cs/project/aladdin/wwwlocal/compression/00063533.pdf
This works pretty well, as long as you don't mess up your node numbering like I did.
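For anyone who wants to try the formula, here is a small C++ sketch. It assumes one particular numbering convention, which may differ from the one in the linked article: the bit string starts with a "10" super-root, each node in level order then contributes one 1 per child followed by a 0, positions are 0-indexed, and a node is identified by the position of the 1 bit that its parent wrote for it. The rank and select here are naive linear scans, only to show the arithmetic; a real succinct structure would use precomputed directories to make them constant time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Louds {
    std::vector<bool> bits;  // level-order unary degree sequence

    // Number of 0 bits in positions [0, x].
    std::size_t rank0(std::size_t x) const {
        std::size_t zeros = 0;
        for (std::size_t i = 0; i <= x && i < bits.size(); ++i) {
            if (!bits[i]) ++zeros;
        }
        return zeros;
    }

    // Position of the k-th 1 bit (k counted from 1), or -1 if there is none.
    long select1(std::size_t k) const {
        std::size_t ones = 0;
        for (std::size_t i = 0; i < bits.size(); ++i) {
            if (bits[i] && ++ones == k) return static_cast<long>(i);
        }
        return -1;
    }

    // parent(x) = select1(rank0(x)); returns -1 for the root.
    long parent(std::size_t x) const {
        std::size_t k = rank0(x);
        return k == 0 ? -1 : select1(k);
    }
};
```

For a root A with children B and C, where B has one child D, the bits are 10 110 10 0 0, so A, B, C and D sit at positions 0, 2, 3 and 5. Then parent(5) gives 2 (B), and parent(2) and parent(3) both give 0 (A). Nothing in the formula depends on the branching factor, so it works for a k-ary trie as well.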
I have been reading about tree data structures to model a problem. I need to build an in-memory representation of data that is very similar to the folder/file representation in a file system (I don't mean the actual files stored on disk, but the Explorer-like structure). The tree may be at most 10 levels deep. The intermediate nodes may only have a moderate number of children (say 10), but there could be thousands of leaf nodes (that is, like thousands of files in a folder, where each file is a leaf node).
Some thoughts
A binary tree cannot work, as one node can have at most 2 children (say we can have 3 subfolders).
A very generic tree implementation may be inefficient, as my data can be ordered: a left sibling is smaller than the ones to its right. I hope this allows efficient traversal.
A B-tree sounds very close, but does it insist on balancing requirements? In my case, the depth won't be more than 10, but not all branches are necessarily that deep (say C:/windows, C:/MyDoc../A/B/C).
Please help with your experience. Should I custom-build a tree, or is there a suitable data structure available (I don't mean one specific to a programming language)?
You have two different kinds of nodes: files and folders.
A folder node contains a set (or map) of children, where the children may themselves be files or folders.
Alternatively, you might prefer for a folder node to contain a set of files and a set of folders.
For the sets, just use your favorite representation of ordered sets (probably the one that comes with whatever language you are using). Depending on the exact details of your situation, you might prefer to use a map instead.
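A possible C++ rendering of the first variant, assuming a single node type with an `isFolder` flag and an ordered `std::map` from child name to child node (`FsNode`, `addChild` and `lookup` are names invented for this sketch). The map keeps children sorted by name and makes a per-level lookup logarithmic in the number of children:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct FsNode {
    bool isFolder = false;
    // Children ordered by name; stays empty for files.
    std::map<std::string, std::unique_ptr<FsNode>> children;
};

// Adds (or returns the existing) child called `name` under `folder`.
FsNode* addChild(FsNode& folder, const std::string& name, bool isFolder) {
    std::unique_ptr<FsNode>& slot = folder.children[name];
    if (!slot) {
        slot.reset(new FsNode);
        slot->isFolder = isFolder;
    }
    return slot.get();
}

// Follows the path components from `root`; returns nullptr if any is missing.
const FsNode* lookup(const FsNode& root, const std::vector<std::string>& path) {
    const FsNode* node = &root;
    for (const std::string& name : path) {
        auto it = node->children.find(name);
        if (it == node->children.end()) return nullptr;
        node = it->second.get();
    }
    return node;
}
```

With a depth of at most 10 and thousands of files per folder, each lookup does at most 10 map searches, so there is no need to balance the folder tree itself.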
Use two separate data structures:
A binary search tree for search
And a general binary tree for representation
and link these two together.
Note:
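In the general tree, put the folders first, in order, and put all the files in a BST that hangs off the node as one last child.
Or use:
struct Node {
    Node* Left_Most_Child_Folder;  // first subfolder of this folder
    Node* Right_Sibling_Folder;    // next folder at the same level
    BST_Node* Files_Root;          // root of the BST holding this folder's files
};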
In a typical file system, the "directory-tree" and the search tree are not the same thing, and are usually maintained separately. The "directory-tree", which tells you what files/sub-folders a folder has, or the path to a particular file, simply reflects how the user organizes the files and is only useful to the user. The search tree on the other hand maintains the global index of all files, so as to facilitate a fast search.
For example, you can implement a Linux-like file system, where a folder is a file that records pointers to the other files/folders it contains. At the same time you maintain a B+ tree, which has every file pointer as a leaf. The balance condition of the B+ tree has nothing to do with how the user organizes the folders.
One way to do this would be to use a binary tree of binary trees. For example:
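struct Node {
    Node* Children;  // root of the binary tree holding this node's children
    Node* Left;      // left and right links within this node's own level
    Node* Right;
};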
And the root of your tree is a Node*.
This makes for easy traversal and quick insertion and removal of a node. Provided, of course, you know the path to the level where you want to insert the node, or the path to the node that you want to delete. But since you indicate that you want a model similar to Explorer, I assume that finding a particular level doesn't pose a problem.
Searching for a node at a particular level is as simple as searching a binary tree.
Without a little bit more information about what you're trying to model, that's the best I can do.
Here's a restatement of the rather cryptic title question:
Suppose we have a Prototype tree that has been built, that contains all the info on the structure of the tree and the generic description of each node. Now we want to create instances of this tree with elements that contain extra unique data. Let's call these Concrete trees.
The only difference between Concrete and Prototype trees is the extra data in the nodes of the Concrete tree. Supposing each node of a Concrete tree has a pointer/link to the corresponding element in the Prototype tree for generic information about the node, but no parent/child information of its own:
Is it possible to traverse the Concrete tree?
In particular, given a starting node in the Concrete tree, and a path through the Prototype tree, is it possible to efficiently get the corresponding node in the Concrete tree? There can be many Concrete trees, so a link back from Prototype tree is not possible.
Even though I might not need to optimize things to such an extent in my code, this is still an interesting problem!
Thanks in advance!
NOTE: There are no restrictions on the branching factor of the tree: a node can have between one and hundreds of children.
Extra ramblings/ideas:
The reason I ask is that it seems like a waste to copy parent/child information each time a new instance of a Concrete tree is created, since this structure is identical to the Prototype tree. In my particular case, children are identified by string names, so I have to store a string-to-pointer hash at each node. There can be many instances of Concrete trees, and duplicating this hash seems like a huge waste of space.
As a first idea, perhaps the path could be somehow hashed into an int or something that compactly identifies an element (not a string, since that's too big), which is then used to look up concrete elements in hashes for each Concrete tree?
Once created, will the prototype tree ever change (i.e. will nodes ever be inserted or removed)?
If not, you could consider array-backed trees (i.e. child/parent links are represented by array indices, not raw pointers), and use consistent indexing for your concrete trees. That way, it's trivial to map from concrete to prototype, and vice versa.
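A rough C++ sketch of the array-backed idea, assuming the prototype is frozen before any concrete tree is created and that the per-node extra data is just an `int` (all names here, such as `PrototypeTree`, `ConcreteTree` and `resolve`, are made up for illustration). The prototype owns the name-to-child maps once; each concrete tree is only a flat array of data that shares the prototype's indices:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct ProtoNode {
    int parent = -1;                      // index of the parent, -1 for the root
    std::map<std::string, int> children;  // child name -> node index
};

struct PrototypeTree {
    std::vector<ProtoNode> nodes;
    PrototypeTree() : nodes(1) {}  // index 0 is the root

    int addChild(int parent, const std::string& name) {
        int index = static_cast<int>(nodes.size());
        nodes.push_back(ProtoNode());
        nodes[index].parent = parent;
        nodes[parent].children[name] = index;
        return index;
    }

    // Follows `path` from node `start`; returns the node index or -1.
    int resolve(int start, const std::vector<std::string>& path) const {
        int cur = start;
        for (const std::string& name : path) {
            auto it = nodes[cur].children.find(name);
            if (it == nodes[cur].children.end()) return -1;
            cur = it->second;
        }
        return cur;
    }
};

// A concrete tree stores only per-node data, indexed exactly like the prototype.
struct ConcreteTree {
    const PrototypeTree* proto;
    std::vector<int> data;
    explicit ConcreteTree(const PrototypeTree& p)
        : proto(&p), data(p.nodes.size(), 0) {}
};
```

Resolving a path happens once in the prototype, and the resulting index is valid in every concrete tree, so no links from the prototype back to the concrete trees are needed.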
You could have a concrete leaf for each prototype node, but you'd need to do some kind of hashing per tree (as you suggest) to keep different concrete trees separate. At this point you've incurred the same storage cost as a completely separate tree with redundant child/parent pointers. You definitely want a link from the prototype tree to the concrete trees.
I can see this approach being useful if you want to make structural changes to the prototype tree affect all linked concrete trees. Shuffling nodes would instantly affect all concrete trees. You may incur extra cost since it will be impossible to transmit a single concrete tree without either sending every concrete tree or doing some extract operation to rip one tree out.
In general you will not be able to encode a path uniquely in an int.
Just store the parent child relationship in the concrete tree and forget about it. At best it's a single pointer value, worst it's two pointer values. You would need at least that much to keep links between the prototype tree and the concrete tree anyway.
It's possible when there's a known dependency between the addresses of the nodes in both trees. Basically, it means that the nodes have to be fixed-size and allocated all at once.
Sure, it's also possible to use a hashtable to map the addresses of the first tree's nodes to the second tree's nodes, but such a hashtable has to have at least 10x more nodes than the first tree, otherwise the mapping would be too slow.
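#include <stdio.h>

typedef unsigned char byte;

// First tree: holds only the structure (child links).
struct Node1 {
    Node1* child[2];
    Node1() { child[0] = child[1] = 0; }
};

// Second tree: holds only the per-node data, no links.
struct Node2 {
    int N;
    Node2() { N = 0; }
};

int main( void ) {
    int i, j, k, N = 256;
    // Both trees are allocated all at once as arrays of fixed-size nodes,
    // so the node at p[x] corresponds to the node at q[x].
    Node1* p = new Node1[2*N];
    Node2* q = new Node2[2*N];

    // insert: build a binary trie over the 8 bits of each i
    for( i=0,k=1; i<N; i++ ) {
        Node1* root = &p[0];
        Node1** r = &root;
        for( j=7;; j-- ) {
            if( r[0]==0 ) r[0] = &p[k++]; // take the next unused node
            if( j<0 ) break;
            r = &r[0]->child[(i>>j)&1];
        }
        q[r[0]-p].N = byte(i+123);
        //  ^^^^^^ - mapping from p[] to q[]
    }

    // check: walk the first tree, read the data from the second
    for( i=N-1; i>=0; i-- ) {
        Node1* r = &p[0];
        for( j=7; j>=0; j-- ) r = r->child[(i>>j)&1];
        if( q[r-p].N != byte(i+123) ) printf( "error!\n" );
    }

    delete[] p;
    delete[] q;
    return 0;
}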
I think you can do what you describe, but I don't believe it constitutes an optimisation (for the kind of reasons referred to by @Dave). The key to doing so lies in tying the pointers back to the prototype in such a way that they also act as identifiers. In addition, the major traversals through the prototype tree would need to be pre-calculated: a breadth-first and a depth-first traversal.
The pre-calculated traversals are likely to use a stack or a queue, depending on the particular traversal. In addition, as the traversals are done, an indexed linked list needs to be built in traversal order (or, as @Oli suggests, an indexed array). The data in the linked list is the identifier (see below) of the node. Each prototype tree and each prototype node needs an identifier (it could be an address, or an arbitrary identifier). Each concrete tree has its own identifier. Each concrete node is given the SAME identifier as its corresponding node in the prototype tree. Then, to follow a partial traversal, you find the node identifier in the linked list and use it as the identifier of the concrete node.
In essence you are creating a link between the prototype and the concrete nodes, by using the equivalence of the identifiers as the pointer (a sort of "ghost" pointer). It does require a number of supporting mechanisms, and these are likely to cause this route not to be an actual optimisation.