Want to save binary tree to disk for "20 questions" game - data-structures

In short, I'd like to learn/develop an elegant method to save a binary tree to disk (a general tree, not necessarily a BST). Here is the description of my problem:
I'm implementing a game of "20-questions". I've written a binary tree whose internal nodes are questions and leaves are answers. The left child of a node is the path you'd follow if somebody answered "yes" to your current question, while the right child is a "no" answer. Note this is not a binary search tree, just a binary tree whose left child is "yes" and right is "no".
The program adds a node to a tree if it encounters a leaf that is null by asking the user to distinguish her answer from the one the computer was thinking of.
This is neat, because the tree builds itself up as the user plays. What's not neat is that I don't have a good way of saving the tree to disk.
I've thought about saving the tree as an array representation (for node i, left child is 2i+1, and 2i+2 right, (i-1)/2 for parent), but it's not clean and I end up with a lot of wasted space.
Any ideas for an elegant solution to saving a sparse binary tree to disk?

You can store it recursively:
// Uses a java.io.Writer, since OutputStream.write() does not accept a String.
void encodeState(Writer out, Node n) throws IOException {
    if (n == null) {
        out.write("[null]");
    } else {
        out.write("{");
        out.write(n.nodeDetails());
        encodeState(out, n.yesNode());
        encodeState(out, n.noNode());
        out.write("}");
    }
}
Devise your own less texty output format. I'm sure I don't need to describe the method to read the resulting output.
This is depth-first traversal. Breadth-first works too.
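For readers who do want the reading side spelled out, here is a minimal Python sketch of the same depth-first idea. The `Node` class and the line-based format ("+" before a node's text, "-" for a null child) are assumptions made for illustration, not part of the answer above, and the sketch assumes node text never contains a newline.

```python
import io


class Node:
    """A question (internal node) or an answer (leaf) in the game tree."""

    def __init__(self, text, yes=None, no=None):
        self.text = text
        self.yes = yes
        self.no = no


def encode(node, out):
    """Write the tree depth-first, one line per node or null child."""
    if node is None:
        out.write("-\n")
        return
    out.write("+" + node.text + "\n")
    encode(node.yes, out)
    encode(node.no, out)


def decode(lines):
    """Rebuild the tree by consuming lines in the order they were written."""
    line = next(lines)
    if line == "-":
        return None
    node = Node(line[1:])      # strip the leading "+"
    node.yes = decode(lines)
    node.no = decode(lines)
    return node


tree = Node("Does it have wings?",
            Node("a bird"),
            Node("Does it purr?", Node("a cat")))
buf = io.StringIO()
encode(tree, buf)
restored = decode(iter(buf.getvalue().splitlines()))
```

Because the reader consumes lines in exactly the order the writer produced them, no node IDs or parent references are needed.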

I would do a Level-order traversal. That is to say you are basically doing a Breadth-first search algorithm.
The steps are:
1. Create a queue with the root element inserted into it.
2. Dequeue an element from the queue; call it E.
3. Add the left and right children of E to the queue. If a child is missing, enqueue a null node representation instead.
4. Write node E to disk.
5. Repeat from step 2 until the queue is empty.
Level-order traversal sequence: F, B, G, A, D, I, C, E, H
What you will store on disk: F, B, G, A, D, NullNode, I, NullNode, NullNode, C, E, H, NullNode
Loading it back from disk is even easier. Simply read from left to right the nodes you stored to disk. This will give you each level's left and right nodes. I.e. the tree will fill in from top to bottom left to right.
Step 1 reading in:
F
Step 2 reading in:
F
B
Step 3 reading in:
F
B G
Step 4 reading in:
F
B G
A
And so on ...
Note: Once you have a NULL node representation, you no longer need to list its children to disk. When loading back you will know to skip to the next node. So for very deep trees, this solution will still be efficient.
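The level-order scheme above can be sketched in Python roughly as follows. The `Node` class and function names are hypothetical, `None` stands in for NullNode (so real node values must not be `None`), and the result is kept in a list rather than written to a file. This sketch also drops every trailing null marker, so it stores one NullNode fewer than the list shown above.

```python
from collections import deque


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def save_level_order(root):
    """Return values in level order; None marks a missing child."""
    out = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node is None:
            out.append(None)       # children of a null are never enqueued
            continue
        out.append(node.value)
        queue.append(node.left)
        queue.append(node.right)
    while out and out[-1] is None:  # trailing nulls carry no information
        out.pop()
    return out


def load_level_order(items):
    """Rebuild the tree, filling each level from left to right."""
    it = iter(items)
    first = next(it, None)
    if first is None:
        return None
    root = Node(first)
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        for side in ("left", "right"):
            value = next(it, None)  # running out of items means "null"
            if value is not None:
                child = Node(value)
                setattr(parent, side, child)
                queue.append(child)
    return root


# The example tree from the answer: F is the root.
tree = Node("F",
            Node("B", Node("A"), Node("D", Node("C"), Node("E"))),
            Node("G", None, Node("I", Node("H"))))
stored = save_level_order(tree)
restored = load_level_order(stored)
```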

A simple way to accomplish this is to traverse the tree outputting each element as you do so. Then to load the tree back, simply iterate through your list, inserting each element back into the tree. If your tree isn't self balancing, you may want to reorder the list in such a way that the final tree is reasonably balanced.

Not sure it's elegant, but it's simple and explainable:
Assign a unique ID to each node, whether stem or leaf. A simple counting integer will do.
When saving to disk, traverse the tree, storing each node ID, "yes" link ID, "no" link ID, and the text of the question or answer. For null links, use zero as the null value. You could either add a flag to indicate whether question or answer, or more simply, check whether both links are null. You should get something like this:
1,2,3,"Does it have wings?"
2,0,0,"a bird"
3,4,0,"Does it purr?"
4,0,0,"a cat"
Note that if you use the sequential integers approach, saving the node's ID may be redundant, as shown here. You could just put them in order by ID.
To restore from disk, read a line, then add it to the tree. You will probably need a table or array to hold forward-referenced nodes, e.g. when processing node 1, you'll need to keep track of 2 and 3 until you can fill in those values.
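A rough Python sketch of this ID-based format is below. The `Node` class and function names are made up for illustration, IDs are assigned in pre-order, and the loader assumes the first row is the root. Creating every node before wiring up the links sidesteps the forward-reference bookkeeping mentioned above.

```python
import csv
import io


class Node:
    def __init__(self, text, yes=None, no=None):
        self.text, self.yes, self.no = text, yes, no


def save_rows(root):
    """Number nodes 1, 2, 3... in pre-order and emit one CSV row per node."""
    order, ids = [], {}

    def number(node):
        if node is not None:
            ids[id(node)] = len(ids) + 1
            order.append(node)
            number(node.yes)
            number(node.no)

    number(root)
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for node in order:
        yes_id = ids[id(node.yes)] if node.yes is not None else 0
        no_id = ids[id(node.no)] if node.no is not None else 0
        writer.writerow([ids[id(node)], yes_id, no_id, node.text])
    return buf.getvalue()


def load_rows(text):
    """Two passes: create all nodes first, then resolve the links."""
    rows = [(int(i), int(y), int(n), label)
            for i, y, n, label in csv.reader(io.StringIO(text))]
    nodes = {i: Node(label) for i, _, _, label in rows}
    for i, yes_id, no_id, _ in rows:
        nodes[i].yes = nodes.get(yes_id)   # ID 0 is never a key, so it maps to None
        nodes[i].no = nodes.get(no_id)
    return nodes[rows[0][0]]               # assumption: the first row is the root


tree = Node("Does it have wings?",
            Node("a bird"),
            Node("Does it purr?", Node("a cat")))
text = save_rows(tree)
restored = load_rows(text)
```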

The simplest general-purpose way is a basic format that can be used to represent any graph.
<parent>,<relation>,<child>
I.e.:
"Is it Red", "yes", "does it have wings"
"Is it Red", "no" , "does it swim"
There isn't much redundancy here, and the format is mostly human readable. The only data duplication is that there must be a copy of a parent for every direct child it has.
The only thing you really have to watch is that you don't accidentally generate a cycle ;)
Unless that's what you want.
The problem here is rebuilding the tree afterwards. If I create the "does it have wings" object upon reading the first line, I have to somehow locate it when I later encounter the line reading "does it have wings","yes","Has it got a beak?"
This is why I traditionally just use graph structures in memory for such a thing with pointers going everywhere.
[0x1111111 "Is It Red" => [ 'yes' => 0xF752347 , 'no' => 0xFF6F664 ],
0xF752347 "does it have wings" => [ 'yes' => 0xFFFFFFF , 'no' => 0x2222222 ],
0xFF6F664 "does it swim" => [ 'yes' => "I Dont KNOW :( " , ... etc etc ]
Then the "child/parent" connectivity is merely metadata.
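One way to handle the lookup problem raised above is to keep a dictionary from node text to node while reading the triples. The following Python sketch assumes every node's text is unique and that each relation is the literal string "yes" or "no"; the function name and the dict-based node layout are illustrative choices, not something from the answer.

```python
def build_from_edges(edges):
    """Build a tree from (parent_text, relation, child_text) triples.

    Nodes are plain dicts. Triples may arrive in any order, because every
    node is looked up (or created) by its text.
    """
    nodes = {}
    child_texts = set()

    def get(text):
        if text not in nodes:
            nodes[text] = {"text": text, "yes": None, "no": None}
        return nodes[text]

    for parent, relation, child in edges:
        if relation not in ("yes", "no"):
            raise ValueError("unknown relation: " + relation)
        get(parent)[relation] = get(child)
        child_texts.add(child)

    # The root is the only node that never appears as a child.
    roots = [node for text, node in nodes.items() if text not in child_texts]
    if len(roots) != 1:
        raise ValueError("expected exactly one root, found %d" % len(roots))
    return roots[0]


edges = [
    ("does it have wings", "yes", "a bird"),   # deliberately listed before its parent
    ("Is it Red", "yes", "does it have wings"),
    ("Is it Red", "no", "does it swim"),
]
root = build_from_edges(edges)
```

If the edges form a pure cycle, no node qualifies as a root and the function raises, which catches the simplest form of the accidental cycle mentioned earlier.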

In Java, if you make the class Serializable, you can just write the object to disk and read it back using object input/output streams.

I would store the tree like this:
<node identifier>
node data
[<yes child identifier>
yes child]
[<no child identifier>
no child]
<end of node identifier>
where the child nodes are just recursive instances of the above. The bits in [] are optional and the four identifiers are just constants/enum values.

Here is the C++ code using PreOrder DFS:
void SaveBinaryTreeToStream(TreeNode* root, ostringstream& oss)
{
    if (!root)
    {
        oss << '#';
        return;
    }
    oss << root->data;
    SaveBinaryTreeToStream(root->left, oss);
    SaveBinaryTreeToStream(root->right, oss);
}
TreeNode* LoadBinaryTreeFromStream(istringstream& iss)
{
    // eof() is only set after a read has failed, so test the read itself.
    char c;
    if (!iss.get(c) || '#' == c)
        return NULL;
    TreeNode* root = new TreeNode(c, NULL, NULL);
    root->left = LoadBinaryTreeFromStream(iss);
    root->right = LoadBinaryTreeFromStream(iss);
    return root;
}
In main(), you can do:
ostringstream oss;
root = MakeCharTree();
PrintVTree(root);
SaveBinaryTreeToStream(root, oss);
ClearTree(root);
cout << oss.str() << endl;
istringstream iss(oss.str());
cout << iss.str() << endl;
root = LoadBinaryTreeFromStream(iss);
PrintVTree(root);
ClearTree(root);
/* Output:
A
B C
D E F
G H I
ABD#G###CEH##I##F##
ABD#G###CEH##I##F##
A
B C
D E F
G H I
*/
The DFS is easier to understand.
*********************************************************************************
But we can use level scan BFS using a queue
ostringstream SaveBinaryTreeToStream_BFS(TreeNode* root)
{
    ostringstream oss;
    if (!root)
        return oss;
    queue<TreeNode*> q;
    q.push(root);
    while (!q.empty())
    {
        TreeNode* tn = q.front(); q.pop();
        if (tn)
        {
            q.push(tn->left);
            q.push(tn->right);
            oss << tn->data;
        }
        else
        {
            oss << '#';
        }
    }
    return oss;
}
TreeNode* LoadBinaryTreeFromStream_BFS(istringstream& iss)
{
    // eof() is only set after a read has failed, so test the read itself.
    char c;
    if (!iss.get(c) || '#' == c) // empty stream or null root
        return NULL;
    TreeNode* root = new TreeNode(c, NULL, NULL);
    queue<TreeNode*> q; q.push(root); // The parents from upper level
    while (!q.empty())
    {
        TreeNode* tn = q.front(); q.pop();
        // A failed read is treated like '#', so truncated input still loads.
        if (iss.get(c) && '#' != c)
            q.push(tn->left = new TreeNode(c, NULL, NULL));
        else
            tn->left = NULL;
        if (iss.get(c) && '#' != c)
            q.push(tn->right = new TreeNode(c, NULL, NULL));
        else
            tn->right = NULL;
    }
    return root;
}
In main(), you can do:
root = MakeCharTree();
PrintVTree(root);
ostringstream oss = SaveBinaryTreeToStream_BFS(root);
ClearTree(root);
cout << oss.str() << endl;
istringstream iss(oss.str());
cout << iss.str() << endl;
root = LoadBinaryTreeFromStream_BFS(iss);
PrintVTree(root);
ClearTree(root);
/* Output:
A
B C
D E F
G H I
ABCD#EF#GHI########
ABCD#EF#GHI########
A
B C
D E F
G H I
*/

Related

In-order traversal of unthreaded binary search tree, without a stack

Suppose that we somehow get a dump of an alien computer's memory. Unfortunately, said alien civilization had much larger RAM sizes than we have, and somehow loved functional languages so much that all the data is in a gigantic, unthreaded, binary search tree (with no parent pointers either). Hey, aliens have lots of stack for recursion!
Now this binary tree is in a gigantic 100 terabyte disk array. We need some way of getting an in-order traversal. There is the recursive way, which uses up stack, and no computer has 100 terabytes of stack, and also the "iterative" way, which is really manually maintaining a stack.
We are allowed to modify the tree, but only with additional pointer fields and integer fields at the nodes. This is because the 100 terabytes disk array is almost completely full. We definitely can't do something like use another 100 terabytes as mmap'ed stack or something.
How can this impossible task be completed? The really infuriating part is that, hey, the tree is sitting there, perfectly ordered, inside the disk array, but we can't seem to get it out in order.
You can traverse the tree, by temporarily linking the rightmost child in each tree to its successor in the inorder traversal. For example, when you first come to t:
    t
  / ^ \
 a  |  d
/\  |
b c-+
you link the rightmost node of t's left subtree back to t via its originally null right child pointer. Later on, when following right pointers, you come back to t and repeat the same search for the rightmost node of the left subtree, but this time the search ends at the link pointing back to t. In this case you restore the pointer to null, visit t, and continue traversing the right child of t.
This is essentially exercise 21 in "The Art of Computer Programming", Vol.1, section 2.3.1.
struct tree {
tree (int v) : value (v), left (nullptr), right (nullptr) {}
int value;
tree *left, *right;
};
template<typename F>
void
inorder (tree *t, F visit) {
while (t != nullptr) {
if (t->left == nullptr) {
visit (t);
t = t->right;
} else {
tree *q, *r;
r = t;
q = t = t->left;
while (q->right && q->right != r)
q = q->right;
if (q->right == nullptr)
q->right = r;
else {
q->right = nullptr;
visit (r);
t = r->right;
}
}
}
}

Algorithm for evaluating nested logical expression

I have a logical expression that I would like to evaluate.
The expression can be nested and consists of T (True), F (False), and parentheses.
The parenthesis "(" means "logical OR".
Two terms beside each other, such as TF (or any other combination of two), should be ANDed (logical AND).
For example, the expression:
((TFT)T) = true
I need an algorithm for solving this problem. I thought of converting the expression first to disjunctive or conjunctive normal form and then I can easily evaluate the expression. However, I couldn't find an algorithm that normalizes the expression. Any suggestions? Thank you.
The problem statement can be found here:
https://icpcarchive.ecs.baylor.edu/index.php?option=com_onlinejudge&Itemid=2&category=378&page=show_problem&problem=2967
Edit: I misunderstood part of the problem. In the given logical expression, the AND/OR operators alternate with every parenthesis "(". If we represent the expression as a tree, then the AND/OR operators depend on the sub-tree's depth level. However, it is given that the trees at the deepest level are AND-trees. My task is to evaluate the given expression, possibly by constructing the tree.
Thanks for the answers below which clarified the correct requirement of the problem.
Scan the string from left to right. Every time you see a left parenthesis, add a new entry to a stack structure. When you see a right parenthesis, pop the top-most entry on the stack, evaluate it to T or F, pop the stack again, and append the computed value to the popped term. Continue until the end of the string, at which point you will have a string of T and F, and you evaluate it.
To evaluate a string of Ts and Fs, return T if all are T, and F otherwise. So we have...
evaluate(String expression)
1. subexpr = ""
2. for i := 1 to n do
3. if expression[i] == "(" then
4. stack.push(subexpr)
5. subexpr = ""
6. else if expression[i] == ")" then
7. result = evaluate2(subexpr)
8. subexpr = stack.pop() + result
9. else subexpr += expression[i]
10. return evaluate2(subexpr)
evaluate2(String expression)
1. for i := 1 to n do
2. if expression[i] == "F" then return "F"
3. return "T"
Or something like that should do it (EDIT: in fact, this does not correctly answer the question, even as asked; see the comments. Leaving this alone since it still gets one going in the right direction). Note that you could just have one function, evaluate, that does what evaluate2 does, but after the first loop, and only to subexpr. This avoids going through the unnecessary copy that would entail, but you'd have less code the other way.
After having looked at the original problem, I think you have misunderstood it.
This question is about an AND/OR tree where the nodes at the deepest level are AND nodes. The logical operatives at the other nodes are determined by this factor: we do not know initially whether they are AND or OR nodes, we're only given that the nodes at the deepest level are AND nodes. So the nodes at the next higher level are OR nodes, the ones at the level above that are AND nodes, and so on. The logical operatives alternate between different depths of the tree. This will become clear if you look at the sample AND/OR tree they have provided.
The way I'd approach this problem is to first figure out the logical connective for the root node. This can be done with a single scan over the expression and keeping track of the number of parentheses. Note that each () corresponds to a new node in the tree (the next level of the tree). For an example, consider the expression:
((F(TF))(TF))
When you walk across this expression, first we encounter 3 opening parentheses, 2 closing, 1 opening and then finally 2 closing. If you take the maximum number of parentheses that were open at any given time during this walk, it'll be the maximum depth of this AND/OR tree (3 in the above example).
So what does this mean? If the depth of the tree is odd, then the root node is an AND node, otherwise the root is an OR node (because the connectives alternate).
Once you know the connective of the root node, you can evaluate this expression using a simple stack-based machine. We need to keep in mind that every time we open or close a parenthesis, we need to flip the connective. Here's how the above expression gets evaluated:
AND |- (•(F(TF))(TF))
Notice that the bullet indicates where we are at the expression (like top of the stack). Then we proceed like below:
OR |- ((•F(TF))(TF)) // flipped the connective because we jumped a node
OR |- ((F•(TF))(TF)) // nothing to evaluate on the current node, push F
AND |- ((F(•TF))(TF))
AND |- ((F(T•F))(TF))
AND |- ((F(TF•))(TF))
AND |- ((F(F•))(TF)) // Two booleans on top, T AND F = F (reduce)
OR |- ((F(F)•)(TF)) // Jumped out of a node, flip the sign
OR |- ((FF•)(TF)) // Completely evaluated node on top, (F) = F (reduce)
OR |- ((F•)(TF)) // Two booleans on top, F OR F = F (reduce)
AND |- ((F)•(TF))
AND |- (F•(TF))
OR |- (F(•TF))
OR |- (F(T•F))
OR |- (F(TF•))
OR |- (F(T•))
AND |- (F(T)•)
AND |- (FT•)
AND |- (F•)
So you get the final answer as F. This has some relation to shift-reduce parsing but the reductions in this case depend on the current depth of the AST we're operating at. I hope you'll be able to translate this idea into code (you'll need a stack and a global variable for keeping track of the current logical operative in force).
Finally, thank you for introducing that site. You might also like this site.
From reading the problem description at the site you linked to, I think you may have misunderstood the problem. Whether you need to "logical AND" or "logical OR" the terms depends on how many levels down you are from the root node.
You can easily solve this problem by parsing the expression into a syntax tree, and then walking the tree recursively, evaluating each sub-expression until you get back up to the root node.
I solved this problem using a different technique than the ones mentioned. And I got it Accepted by the online system judge.
After figuring out the operator at the first level of the tree (thanks to @Asiri Rathnayake for his idea), I recursively construct the expression tree. During the construction, I scan the string. If the character is '(', then I create a node with the current operator value and add it to the tree. Then, I alternate the operator and go for a deeper recursion level. If the character is 'T', then I create a node with value "True", add it to the tree and continue scanning. If the character is 'F', then I create a node with the value "False", add it to the tree and continue scanning. Finally, if the character is ')', then I return to one level up of the recursion.
At the end, I will have the expression tree completed. Now, all I need to do is a simple evaluation for the tree using basic recursive function.
Below is my C++ code:
#include<iostream>
#include<string>
#include<vector>
#include<algorithm>
using namespace std;
struct Node {
char value;
vector<Node*> children;
};
void ConstructTree (int &index, string X, Node *&node, int op)
{
for(; index<X.size(); index++)
{
if(X[index]=='T')
{
Node *C= new Node;
C->value='T';
node->children.push_back(C);
}
else if(X[index]=='F')
{
Node* C= new Node;
C->value='F';
node->children.push_back(C);
}
else if(X[index]=='(')
{
if(op==0)
{
Node* C= new Node;
C->value='O';
node->children.push_back(C);
}
else
{
Node* C= new Node;
C->value='A';
node->children.push_back(C);
}
index++;
ConstructTree(index,X,node->children[node->children.size()-1],1-op);
}
else
return;
}
}
bool evaluateTree(Node* node)
{
if(node->value=='T')
return true;
else if(node->value=='F')
return false;
else if(node->value=='O')
{
for(int i=0; i<node->children.size(); i++)
if(evaluateTree(node->children[i])==true)
return true;
return false;
}
else // node->value=='A' (AND node); a plain else ensures every path returns a value
{
for(int i=0; i<node->children.size(); i++)
if(evaluateTree(node->children[i])==false)
return false;
return true;
}
}
int main()
{
string X;
int testCase=1;
while(cin>>X)
{
if(X=="()")
break;
int index=0;
int op=-1;
int P=0;
int max=0;
for(int i=0; i<X.size(); i++)
{
if(X[i]=='(')
P++;
if(X[i]==')')
P--;
if(P>max)
max=P;
}
if(max%2==0)
op=0; //OR
else
op=1; //AND
Node* root = new Node;
if(op==0)
root->value='O';
else
root->value='A';
index++;
ConstructTree(index,X,root,1-op);
if(evaluateTree(root))
cout<<testCase<<". true"<<endl;
else
cout<<testCase<<". false"<<endl;
testCase++;
}
}

binary tree with special property

There is a binary tree with the special property that all its inner nodes have val = 'N' and all its leaves have val = 'L'. Given its preorder traversal, construct the tree and return the root node.
Every node has either two children or no children.
Recursion is your friend.
Tree TreeFromPreOrder(Stream t) {
switch (t.GetNext()) {
case Leaf: return new LeafNode;
case InternalNode:
Node n = new Node;
n.Left = TreeFromPreOrder(t);
n.Right = TreeFromPreOrder(t);
return n;
default:
throw BadPreOrderException;
}
}
Looking at it as a recursive method, it becomes easy to see how to do other things.
For instance, say we wanted to print the InOrder traversal. The code will look something like this:
void PrintInorderFromPreOrder(Stream t) {
    Node n = new Node(t.GetNext());
    switch (n.Type) {
        case Leaf:
            print(n.Value);  // a leaf is part of the in-order output too
            return;
        case InternalNode:
            PrintInorderFromPreOrder(t);
            print(n.Value);
            PrintInorderFromPreOrder(t);
            return;          // do not fall through to the error case
        default:
            throw BadPreOrderException;
    }
}
Also, I would like to mention that this is not that artificial. This type of representation can actually be used to save space when we need to serialize a binary tree: Efficient Array Storage for Binary Tree.
Just the basic idea: keep a stack where the head is the "current" node and read sequentially the string representing the preorder.
Now, if you encounter an 'L', it means the "current" node has a leaf as a child, so you can "switch" to the right child and resume building the corresponding subtree, pushing the root of that subtree; if, when encountering an 'L', the "current" node already has two children, pop an element from the stack.
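A possible Python rendering of the stack idea is sketched below. It is a slight variation, not a literal transcription: the stack holds only inner nodes that still lack a child, and a node is popped as soon as its right child is attached. The `Node` class and function names are assumptions for illustration.

```python
class Node:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None


def tree_from_preorder(preorder):
    """Build the tree iteratively from a string of 'N' (inner) and 'L' (leaf)."""
    if not preorder:
        return None
    root = Node(preorder[0])
    # The stack holds inner nodes that are still missing at least one child.
    stack = [root] if root.val == "N" else []
    for val in preorder[1:]:
        if not stack:
            raise ValueError("symbols left over after the tree was complete")
        node = Node(val)
        parent = stack[-1]
        if parent.left is None:
            parent.left = node
        else:
            parent.right = node
            stack.pop()             # parent now has both children
        if val == "N":
            stack.append(node)      # this inner node needs two children
    if stack:
        raise ValueError("preorder ended before every inner node was filled")
    return root


def preorder_of(node):
    """Helper for checking the result: regenerate the preorder string."""
    if node is None:
        return ""
    return node.val + preorder_of(node.left) + preorder_of(node.right)


tree = tree_from_preorder("NNLLL")
```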

Create Balanced Binary Search Tree from Sorted linked list

What's the best way to create a balanced binary search tree from a sorted singly linked list?
How about creating nodes bottom-up?
This solution's time complexity is O(N). Detailed explanation in my blog post:
http://www.leetcode.com/2010/11/convert-sorted-list-to-balanced-binary.html
Two traversals of the linked list are all we need: a first traversal to get the length of the list (which is then passed in as the parameter n into the function), and a second to create the nodes in the list's order.
BinaryTree* sortedListToBST(ListNode *& list, int start, int end) {
if (start > end) return NULL;
// same as (start+end)/2, avoids overflow
int mid = start + (end - start) / 2;
BinaryTree *leftChild = sortedListToBST(list, start, mid-1);
BinaryTree *parent = new BinaryTree(list->data);
parent->left = leftChild;
list = list->next;
parent->right = sortedListToBST(list, mid+1, end);
return parent;
}
BinaryTree* sortedListToBST(ListNode *head, int n) {
return sortedListToBST(head, 0, n-1);
}
You can't do better than linear time, since you have to at least read all the elements of the list, so you might as well copy the list into an array (linear time) and then construct the tree efficiently in the usual way, i.e. if you had the list [9,12,18,23,24,51,84], then you'd start by making 23 the root, with children 12 and 51, then 9 and 18 become children of 12, and 24 and 84 become children of 51. Overall, should be O(n) if you do it right.
The actual algorithm, for what it's worth, is "take the middle element of the list as the root, and recursively build BSTs for the sub-lists to the left and right of the middle element and attach them below the root".
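As a minimal Python sketch of the middle-element recursion, assuming the linked list has already been copied into a Python list (the `TreeNode` class and function name are hypothetical):

```python
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def build_balanced(values, lo=0, hi=None):
    """Build a balanced BST from the sorted slice values[lo:hi]."""
    if hi is None:
        hi = len(values)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2                      # middle element becomes the root
    node = TreeNode(values[mid])
    node.left = build_balanced(values, lo, mid)
    node.right = build_balanced(values, mid + 1, hi)
    return node


root = build_balanced([9, 12, 18, 23, 24, 51, 84])
```

Passing index bounds instead of slicing keeps the whole construction linear in the number of elements, since each element is visited exactly once.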
Best isn't only about asymptotic run time. The sorted linked list has all the information needed to create the binary tree directly, and I think this is probably what they are looking for.
Note that the first and third entries become children of the second, then the fourth node has the second and sixth as children (the sixth has the fifth and seventh as children) and so on...
In pseudocode:
read three elements, make a node from them, mark as level 1, push on stack
loop
    read three elements and make a node of them
    mark as level 1
    push on stack
    loop while top two entries on stack have same level (n)
        make node of top two entries, mark as level n + 1, push on stack
while elements remain in list
(with a bit of adjustment for when there are fewer than three elements left or an unbalanced tree at any point)
EDIT:
At any point, there is a left node of height N on the stack. Next step is to read one element, then read and construct another node of height N on the stack. To construct a node of height N, make and push a node of height N -1 on the stack, then read an element, make another node of height N-1 on the stack -- which is a recursive call.
Actually, this means the algorithm (even as modified) won't produce a balanced tree. If there are 2N+1 nodes, it will produce a tree with 2N-1 values on the left, and 1 on the right.
So I think @sgolodetz's answer is better, unless I can think of a way of rebalancing the tree as it's built.
Trick question!
The best way is to use the STL, and avail yourself of the fact that the sorted associative container ADT, of which set is an implementation, demands that insertion of sorted ranges take amortized linear time. Any passable set of core data structures for any language should offer a similar guarantee. For a real answer, see the quite clever solutions others have provided.
What's that? I should offer something useful?
Hum...
How about this?
The smallest possible meaningful tree in a balanced binary tree is 3 nodes.
A parent, and two children. The very first instance of such a tree is the first three elements. Child-parent-Child. Let's now imagine this as a single node. Okay, well, we no longer have a tree. But we know that the shape we want is Child-parent-Child.
Done for a moment with our imaginings, we want to keep a pointer to the parent in that initial triumvirate. But it's singly linked!
We'll want to have four pointers, which I'll call A, B, C, and D. So, we move A to 1, set B equal to A and advance it one. Set C equal to B, and advance it two. The node under B already points to its right-child-to-be. We build our initial tree. We leave B at the parent of Tree one. C is sitting at the node that will have our two minimal trees as children. Set A equal to C, and advance it one. Set D equal to A, and advance it one. We can now build our next minimal tree. D points to the root of that tree, B points to the root of the other, and C points to the... the new root from which we will hang our two minimal trees.
How about some pictures?
[A][B][-][C]
With our image of a minimal tree as a node...
[B = Tree][C][A][D][-]
And then
[Tree A][C][Tree B]
Except we have a problem. The node two after D is our next root.
[B = Tree A][C][A][D][-][Roooooot?!]
It would be a lot easier on us if we could simply maintain a pointer to it instead of to it and C. Turns out, since we know it will point to C, we can go ahead and start constructing the node in the binary tree that will hold it, and as part of this we can enter C into it as a left-node. How can we do this elegantly?
Set the pointer of the Node under C to the node Under B.
It's cheating in every sense of the word, but by using this trick, we free up B.
Alternatively, you can be sane, and actually start building out the node structure. After all, you really can't reuse the nodes from the SLL, they're probably POD structs.
So now...
[TreeA]<-[C][A][D][-][B]
[TreeA]<-[C]->[TreeB][B]
And... Wait a sec. We can use this same trick to free up C, if we just let ourselves think of it as a single node instead of a tree. Because after all, it really is just a single node.
[TreeC]<-[B][A][D][-][C]
We can further generalize our tricks.
[TreeC]<-[B][TreeD]<-[C][-]<-[D][-][A]
[TreeC]<-[B][TreeD]<-[C]->[TreeE][A]
[TreeC]<-[B]->[TreeF][A]
[TreeG]<-[A][B][C][-][D]
[TreeG]<-[A][-]<-[C][-][D]
[TreeG]<-[A][TreeH]<-[D][B][C][-]
[TreeG]<-[A][TreeH]<-[D][-]<-[C][-][B]
[TreeG]<-[A][TreeJ]<-[B][-]<-[C][-][D]
[TreeG]<-[A][TreeJ]<-[B][TreeK]<-[D][-]<-[C][-]
[TreeG]<-[A][TreeJ]<-[B][TreeK]<-[D][-]<-[C][-]
We are missing a critical step!
[TreeG]<-[A]->([TreeJ]<-[B]->([TreeK]<-[D][-]<-[C][-]))
Becomes :
[TreeG]<-[A]->[TreeL->([TreeK]<-[D][-]<-[C][-])][B]
[TreeG]<-[A]->[TreeL->([TreeK]<-[D]->[TreeM])][B]
[TreeG]<-[A]->[TreeL->[TreeN]][B]
[TreeG]<-[A]->[TreeO][B]
[TreeP]<-[B]
Obviously, the algorithm can be cleaned up considerably, but I thought it would be interesting to demonstrate how one can optimize as you go by iteratively designing your algorithm. I think this kind of process is what a good employer should be looking for more than anything.
The trick, basically, is that each time we reach the next midpoint, which we know is a parent-to-be, we know that its left subtree is already finished. The other trick is that we are done with a node once it has two children and something pointing to it, even if all of the sub-trees aren't finished. Using this, we can get what I am pretty sure is a linear time solution, as each element is touched only 4 times at most. The problem is that this relies on being given a list that will form a truly balanced binary search tree. There are, in other words, some hidden constraints that may make this solution either much harder to apply, or impossible. For example, if you have an odd number of elements, or if there are a lot of non-unique values, this starts to produce a fairly silly tree.
Considerations:
Render the element unique.
Insert a dummy element at the end if the number of nodes is odd.
Sing longingly for a more naive implementation.
Use a deque to keep the roots of completed subtrees and the midpoints in, instead of mucking around with my second trick.
This is a python implementation:
def sll_to_bbst(sll, start, end):
    """Build a balanced binary search tree from sorted linked list.

    This assumes that you have a class BinarySearchTree, with properties
    'l_child' and 'r_child'.

    Params:
        sll: sorted linked list, any data structure with 'popleft()' method,
            which removes and returns the leftmost element of the list. The
            easiest thing to do is to use 'collections.deque' for the sorted
            list.
        start: int, start index, on initial call set to 0
        end: int, on initial call should be set to len(sll)

    Returns:
        A balanced instance of BinarySearchTree

    This is a python implementation of solution found here:
    http://leetcode.com/2010/11/convert-sorted-list-to-balanced-binary.html
    """
    if start >= end:
        return None
    middle = (start + end) // 2
    l_child = sll_to_bbst(sll, start, middle)
    root = BinarySearchTree(sll.popleft())
    root.l_child = l_child
    root.r_child = sll_to_bbst(sll, middle+1, end)
    return root
Instead of a sorted linked list, I was asked about a sorted array (logically it doesn't matter, though the run time varies) and had to create a BST of minimal height. The following is the code I came up with:
typedef struct Node{
struct Node *left;
int info;
struct Node *right;
}Node_t;
Node_t* Bin(int low, int high) {
Node_t* node = NULL;
int mid = 0;
if(low <= high) {
mid = (low+high)/2;
node = CreateNode(a[mid]);
printf("DEBUG: creating node for %d\n", a[mid]);
if(node->left == NULL) {
node->left = Bin(low, mid-1);
}
if(node->right == NULL) {
node->right = Bin(mid+1, high);
}
return node;
}//if(low <=high)
else {
return NULL;
}
}//Bin(low,high)
Node_t* CreateNode(int info) {
Node_t* node = malloc(sizeof(Node_t));
memset(node, 0, sizeof(Node_t));
node->info = info;
node->left = NULL;
node->right = NULL;
return node;
}//CreateNode(info)
// call function for an array example: 6 7 8 9 10 11 12, it gets you desired
// result
Bin(0,6);
HTH Somebody..
This is the pseudo recursive algorithm that I will suggest.
createTree(treenode *&root, linknode *start, linknode *end)
{
    // Builds a subtree from the half-open range [start, end).
    if (start == end)
    {
        return;
    }
    ptrsingle = start;
    ptrdouble = start;
    while (ptrdouble != end and ptrdouble->next != end)
    {
        ptrsingle = ptrsingle->next;
        ptrdouble = ptrdouble->next->next;
    }
    // ptrsingle will now be at the middle element.
    treenode *cur_node = Allocatememory;
    cur_node->data = ptrsingle->data;
    if (root == null)
    {
        root = cur_node;
    }
    else
    {
        if (cur_node->data (less than) root->data)
            root->left = cur_node
        else
            root->right = cur_node
    }
    createTree(cur_node, start, ptrsingle);
    // Skip the middle element itself when building the right half.
    createTree(cur_node, ptrsingle->next, end);
}
Root = null;
The initial call will be createTree(Root, list, null);
We are doing the recursive building of the tree, but without using the intermediate array.
To get to the middle element every time we are advancing two pointers, one by one element, other by two elements. By the time the second pointer is at the end, the first pointer will be at the middle.
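The running time will be O(n log n), since each level of the recursion walks the list to find its middle. The extra space will be O(log n) for the recursion stack. This is not an efficient solution for a real situation, where you could use a red-black tree, which guarantees O(n log n) for inserting all the elements. But it is good enough for an interview.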
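The two-pointer approach might look like this in Python. This is a sketch under assumptions: the class and function names are invented, each range is treated as half-open (`end` itself is excluded), and the function returns the subtree rather than attaching it to a parent by comparison.

```python
class ListNode:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next


class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None


def middle(start, end):
    """Return the middle node of the non-empty range [start, end)."""
    slow = fast = start
    while fast is not end and fast.next is not end:
        slow = slow.next            # advances one node per step
        fast = fast.next.next       # advances two nodes per step
    return slow


def list_to_bst(start, end=None):
    """Build a balanced BST from the sorted nodes in [start, end)."""
    if start is end:
        return None
    mid = middle(start, end)
    node = TreeNode(mid.val)
    node.left = list_to_bst(start, mid)        # nodes before the middle
    node.right = list_to_bst(mid.next, end)    # nodes after the middle
    return node


def inorder(node):
    """Helper for checking the result."""
    if node is None:
        return []
    return inorder(node.left) + [node.val] + inorder(node.right)


head = None
for v in range(7, 0, -1):           # builds the list 1 -> 2 -> ... -> 7
    head = ListNode(v, head)
root = list_to_bst(head)
```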
Similar to @Stuart Golodetz and @Jake Kurzer, the important thing is that the list is already sorted.
In @Stuart's answer, the array he presented is the backing data structure for the BST. The find operation, for example, would just need to perform array index calculations to traverse the tree. Growing the array and removing elements would be the trickier part, so I'd prefer a vector or other constant time lookup data structure.
@Jake's answer also uses this fact, but unfortunately requires you to traverse the list each time to do a get(index) operation. It requires no additional memory usage, though.
Unless it was specifically mentioned by the interviewer that they wanted an object structure representation of the tree, I would use @Stuart's answer.
In a question like this you'd be given extra points for discussing the tradeoffs and all the options that you have.
Hope the detailed explanation on this post helps:
http://preparefortechinterview.blogspot.com/2013/10/planting-trees_1.html
A slightly improved version of @1337c0d3r's implementation, from my blog:
// Create a balanced BST using @len elements starting from @head, and move @head forward by @len.
TreeNode *sortedListToBSTHelper(ListNode *&head, int len) {
    if (0 == len) return NULL;
    auto left = sortedListToBSTHelper(head, len / 2);
    auto root = new TreeNode(head->val);
    root->left = left;
    head = head->next;
    root->right = sortedListToBSTHelper(head, (len - 1) / 2);
    return root;
}

TreeNode *sortedListToBST(ListNode *head) {
    int n = length(head);  // length() counts the nodes in the list (helper not shown)
    return sortedListToBSTHelper(head, n);
}
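For anyone who wants to experiment with the bottom-up idea without a C++ toolchain, here is a rough Python sketch of the same approach. The names ListNode, TreeNode, from_values and inorder are hypothetical helpers of mine. The point it tries to show is that nodes are created in in-order sequence, so the list cursor advances exactly once per node and the whole build is O(n).

```python
class ListNode:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next


class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None


def sorted_list_to_bst(head):
    # Count the nodes once up front.
    n, node = 0, head
    while node is not None:
        n += 1
        node = node.next

    cursor = head  # next list node that has not been placed in the tree yet

    def helper(length):
        nonlocal cursor
        if length == 0:
            return None
        left = helper(length // 2)          # consumes the first length // 2 nodes
        root = TreeNode(cursor.val)         # cursor is now at the middle
        root.left = left
        cursor = cursor.next
        root.right = helper((length - 1) // 2)
        return root

    return helper(n)


def from_values(values):
    head = None
    for v in reversed(values):
        head = ListNode(v, head)
    return head


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.val] + inorder(t.right)


root = sorted_list_to_bst(from_values([1, 2, 3, 4, 5, 6]))
print(root.val, inorder(root))  # 4 [1, 2, 3, 4, 5, 6]
```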
If you know how many nodes are in the linked list, you can do it like this:
// Gives path to subtree being built. If branch[N] is false, branch
// less from the node at depth N, if true branch greater.
bool branch[max depth];
// If rem[N] is true, then for the current subtree at depth N, its
// greater subtree has one more node than its less subtree.
bool rem[max depth];
// Depth of root node of current subtree.
unsigned depth = 0;
// Number of nodes in current subtree.
unsigned num_sub = Number of nodes in linked list;
// The algorithm relies on a stack of nodes whose less subtree has
// been built, but whose greater subtree has not yet been built. The
// stack is implemented as linked list. The nodes are linked
// together by having the "greater" handle of a node set to the
// next node in the list. "less_parent" is the handle of the first
// node in the list.
Node *less_parent = nullptr;
// h is root of current subtree, child is one of its children.
Node *h, *child;
Node *p = head of the sorted linked list of nodes;
LOOP // loop unconditionally
LOOP WHILE (num_sub > 2)
// Subtract one for root of subtree.
num_sub = num_sub - 1;
rem[depth] = !!(num_sub & 1); // true if num_sub is an odd number
branch[depth] = false;
depth = depth + 1;
num_sub = num_sub / 2;
END LOOP
IF (num_sub == 2)
// Build a subtree with two nodes, slanting to greater.
// I arbitrarily chose to always have the extra node in the
// greater subtree when there is an odd number of nodes to
// split between the two subtrees.
h = p;
p = the node after p in the linked list;
child = p;
p = the node after p in the linked list;
make h and child into a two-element AVL tree;
ELSE // num_sub == 1
// Build a subtree with one node.
h = p;
p = the next node in the linked list;
make h into a leaf node;
END IF
LOOP WHILE (depth > 0)
depth = depth - 1;
IF (not branch[depth])
// We've completed a less subtree, exit while loop.
EXIT LOOP;
END IF
// We've completed a greater subtree, so attach it to
// its parent (that is less than it). We pop the parent
// off the stack of less parents.
child = h;
h = less_parent;
less_parent = h->greater_child;
h->greater_child = child;
num_sub = 2 * (num_sub - rem[depth]) + rem[depth] + 1;
IF (num_sub & (num_sub - 1))
// num_sub is not a power of 2
h->balance_factor = 0;
ELSE
// num_sub is a power of 2
h->balance_factor = 1;
END IF
END LOOP
IF (num_sub == number of nodes in original linked list)
// We've completed the full tree, exit outer unconditional loop
EXIT LOOP;
END IF
// The subtree we've completed is the less subtree of the
// next node in the sequence.
child = h;
h = p;
p = the next node in the linked list;
h->less_child = child;
// Put h onto the stack of less parents.
h->greater_child = less_parent;
less_parent = h;
// Proceed to creating greater than subtree of h.
branch[depth] = true;
num_sub = num_sub + rem[depth];
depth = depth + 1;
END LOOP
// h now points to the root of the completed AVL tree.
For an encoding of this in C++, see the build member function (currently at line 361) in https://github.com/wkaras/C-plus-plus-intrusive-container-templates/blob/master/avl_tree.h . It's actually more general, a template using any forward iterator rather than specifically a linked list.
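If the iterative bookkeeping above is hard to follow, a recursive counterpart may help as a cross-check. To be clear, this is not the pseudocode's method: it swaps the explicit stack for plain recursion, and AvlNode, build and keys_in_order are hypothetical names of mine. It keeps the same convention that the extra node goes to the greater subtree, and it defines the balance factor as height(greater) minus height(less).

```python
class AvlNode:
    def __init__(self, key):
        self.key = key
        self.less = None
        self.greater = None
        self.balance = 0  # height(greater) - height(less)


def build(it, count):
    """Consume `count` ascending keys from iterator `it`.

    Returns (root, height). With an odd number of nodes to split,
    the extra node goes to the greater subtree.
    """
    if count == 0:
        return None, 0
    less_count = (count - 1) // 2
    less, less_h = build(it, less_count)
    root = AvlNode(next(it))
    greater, greater_h = build(it, count - 1 - less_count)
    root.less, root.greater = less, greater
    root.balance = greater_h - less_h
    return root, 1 + max(less_h, greater_h)


def keys_in_order(node):
    if node is None:
        return []
    return keys_in_order(node.less) + [node.key] + keys_in_order(node.greater)


root, h = build(iter(range(8)), 8)
print(keys_in_order(root), h, root.balance)  # [0, 1, 2, 3, 4, 5, 6, 7] 4 1
```

With this split, a subtree of two or more nodes ends up with balance factor 1 exactly when its node count is a power of 2, and 0 otherwise, which agrees with the power-of-2 test in the pseudocode.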

Inorder tree traversal: Which definition is correct?

I have the following text from an academic course I took a while ago about inorder traversal (they also call it pancaking) of a binary tree (not BST):
Inorder tree traversal
Draw a line around the outside of the
tree. Start to the left of the root,
and go around the outside of the tree,
to end up to the right of the root.
Stay as close to the tree as possible,
but do not cross the tree. (Think of
the tree — its branches and nodes — as
a solid barrier.) The order of the
nodes is the order in which this line
passes underneath them. If you are
unsure as to when you go “underneath”
a node, remember that a node “to the
left” always comes first.
Here's the example used (slightly different tree from below)
However, when I search on Google, I get a conflicting definition. For example, the Wikipedia example:
Inorder traversal sequence: A, B, C,
D, E, F, G, H, I
(left child, root node, right child)
But according to (my understanding of) definition #1, this should be
A, B, D, C, E, F, G, I, H
Can anyone clarify which definition is correct? They might both be describing different traversal methods that happen to share the same name. I'm having trouble believing the peer-reviewed academic text is wrong, but I can't be certain.
Here's my rough attempt at a drawing that shows the order in which the nodes should be picked.
Pretty much, pick the node that is directly above the line being drawn.
Forget the definitions, it's so much easier to just apply the algorithm:
void inOrderPrint(Node root)
{
    if (root.left != null) inOrderPrint(root.left);
    print(root.name);
    if (root.right != null) inOrderPrint(root.right);
}
It's just three lines. Rearrange the order of those three lines to get pre- and post-order.
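As a quick way to check the sequences by machine, here is a small Python sketch of all three orders. The tree shape is my assumption of the Wikipedia example (root F; B with children A and D; D with children C and E; G with right child I; I with left child H), which is consistent with the in-order sequence quoted in the question. The Node class and function names are hypothetical.

```python
class Node:
    def __init__(self, name, left=None, right=None):
        self.name = name
        self.left = left
        self.right = right


def pre_order(n):
    return [] if n is None else [n.name] + pre_order(n.left) + pre_order(n.right)


def in_order(n):
    return [] if n is None else in_order(n.left) + [n.name] + in_order(n.right)


def post_order(n):
    return [] if n is None else post_order(n.left) + post_order(n.right) + [n.name]


# Assumed shape of the Wikipedia example tree.
tree = Node("F",
            Node("B", Node("A"), Node("D", Node("C"), Node("E"))),
            Node("G", None, Node("I", Node("H"))))

print(in_order(tree))    # ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I']
print(pre_order(tree))   # ['F', 'B', 'A', 'D', 'C', 'E', 'G', 'I', 'H']
print(post_order(tree))  # ['A', 'C', 'E', 'D', 'B', 'H', 'I', 'G', 'F']
```

The only difference between the three functions is where the node's own name sits relative to the two recursive calls.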
If you read carefully, you see that the first "definition" says to start left of the root and that the order of the nodes is determined by when you pass under them. So B is not the first node: you pass it on the left on the way to A, then first pass under A, after which you go up and pass under B. Therefore both definitions seem to give the same result.
I personally found this lecture quite helpful.
Both definitions give the same result. Don't be fooled by the letters in the first example - look at the numbers along the path. The second example does use letters to denote the path - perhaps that is what is throwing you off.
For example, in the order you gave for traversing the second tree using the algorithm of the first one, you place "D" after "B", but you shouldn't, because there is still a left-hand child node of D available (that's why the first item says "the order in which this line passes underneath them").
This may be late, but it could be useful for someone later.
You just need to not ignore the dummy (null) nodes. For example, node G has a null left child; taking this null node into account makes everything come out right.
The proper traversal would be: as far left as possible with leaf nodes (not root nodes).
Left  Root  Right
A     B     NULL
C     D     E
NULL  F     G
H     I     NULL
F is root or left; I am not sure.
I think the first binary tree, the one with root A, is a binary tree which is not correctly constructed.
Try to construct it so that everything on the left side of the tree is less than the root and everything on the right side is greater than or equal to the root.
But according to (my understanding of)
definition #1, this should be
A, B, D, C, E, F, G, I, H
Unfortunately, your understanding is wrong.
Whenever you arrive at a node, you must descend to an available left node, before you look at the current node, then you look at an available right node.
When you chose D before C, you didn't descend to the left node first.
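The "descend left first" rule is easy to see in an iterative version with an explicit stack. This is just a sketch in Python; the Node class and the subtree used are my own assumptions, based on the B/A/D/C/E part of the tree discussed in the question.

```python
class Node:
    def __init__(self, name, left=None, right=None):
        self.name = name
        self.left = left
        self.right = right


def in_order(root):
    out, stack, node = [], [], root
    while stack or node is not None:
        # Keep descending to the left; nodes are only remembered here,
        # not yet visited.
        while node is not None:
            stack.append(node)
            node = node.left
        node = stack.pop()     # no left child remains unvisited: visit it now
        out.append(node.name)
        node = node.right      # then handle the right subtree the same way
    return out


# Assumed subtree from the question: B has children A and D; D has children C and E.
subtree = Node("B", Node("A"), Node("D", Node("C"), Node("E")))
print(in_order(subtree))  # ['A', 'B', 'C', 'D', 'E']
```

When the loop reaches D, it is pushed onto the stack and the walk continues to C before D is ever appended, which is exactly the step that was skipped in the A, B, D, C, E ordering.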
In my view, the sequence given in the wiki is correct: the order for an inorder traversal is left-root-right.
Up to A, B, C, D, E, F I think you have understood already. After the root F, the next node is G, which doesn't have a left node but does have a right node, so as per the rule (left-root-right) it's null-G-right. I is the right node of G, but I has a left node H, hence the traversal would be G, H, I. This is correct.
Hope this helps.
For an inorder tree traversal, you have to keep in mind that the order of traversal is left-node-right. For the diagram above that you are unsure about, your error occurs when you read a parent node before reading the child nodes to its left.
The proper traversal would be: as far left as possible to the leaf node (A), return to the parent node (B), move to the right, but since D has a child to its left you move down again (C), back up to C's parent (D), to D's right child (E), back up to the root (F), move to the right child (G), move to G's right child, but since it has a left leaf node move there first (H), return to its parent (I).
The traversal above reads each node at the point where it is listed in parentheses.
package datastructure;

public class BinaryTreeTraversal {
    public static Node<Integer> node;

    public static Node<Integer> sortedArrayToBST(int arr[], int start, int end) {
        if (start > end)
            return null;
        int mid = start + (end - start) / 2;
        Node<Integer> node = new Node<Integer>();
        node.setValue(arr[mid]);
        node.left = sortedArrayToBST(arr, start, mid - 1);
        node.right = sortedArrayToBST(arr, mid + 1, end);
        return node;
    }

    public static void main(String[] args) {
        int[] test = new int[] { 1, 2, 3, 4, 5, 6, 7 };
        Node<Integer> node = sortedArrayToBST(test, 0, test.length - 1);
        System.out.println("preOrderTraversal >> ");
        preOrderTraversal(node);
        System.out.println("");
        System.out.println("inOrderTraversal >> ");
        inOrderTraversal(node);
        System.out.println("");
        System.out.println("postOrderTraversal >> ");
        postOrderTraversal(node);
    }

    public static void preOrderTraversal(Node<Integer> node) {
        if (node != null) {
            System.out.print(" " + node.toString());
            preOrderTraversal(node.left);
            preOrderTraversal(node.right);
        }
    }

    public static void inOrderTraversal(Node<Integer> node) {
        if (node != null) {
            inOrderTraversal(node.left);
            System.out.print(" " + node.toString());
            inOrderTraversal(node.right);
        }
    }

    public static void postOrderTraversal(Node<Integer> node) {
        if (node != null) {
            postOrderTraversal(node.left);
            postOrderTraversal(node.right);
            System.out.print(" " + node.toString());
        }
    }
}
package datastructure;

// Generic binary tree node; E is the type of the stored value.
public class Node<E> {
    E value = null;
    Node<E> left;
    Node<E> right;

    public E getValue() {
        return value;
    }

    public void setValue(E value) {
        this.value = value;
    }

    public Node<E> getLeft() {
        return left;
    }

    public void setLeft(Node<E> left) {
        this.left = left;
    }

    public Node<E> getRight() {
        return right;
    }

    public void setRight(Node<E> right) {
        this.right = right;
    }

    @Override
    public String toString() {
        return " " + value;
    }
}
preOrderTraversal >>
4 2 1 3 6 5 7
inOrderTraversal >>
1 2 3 4 5 6 7
postOrderTraversal >>
1 3 2 5 7 6 4
void
inorder (NODE root)
{
    if (root != NULL)
    {
        inorder (root->llink);
        printf ("%d\t", root->info);
        inorder (root->rlink);
    }
}
This is the simplest recursive definition of in-order traversal. Just call this function from main to get the in-order traversal of a given binary tree.
It is correct for preorder, not for inorder.

Resources