Evaluating expression trees - algorithm

Skiena's book on algorithms contains the following question:
1) Evaluate an expression given as a binary tree in O(n) time, given n nodes.
2) Evaluate an expression given as a DAG in O(n+m) time, given n nodes and m edges in the DAG.
I could think of a way to solve the first question:
evaluate(node) {
    if (node has children) {
        left_val = evaluate(node->left);
        right_val = evaluate(node->right);
        // find operation symbol for node and use it as
        // val = left_val operation right_val
        return val;
    }
    else {
        return node_value;
    }
}
Since we visit each node once, it will take O(n) time.
Since the book has no solutions, can anyone tell me whether this is correct?
Also, can anyone suggest a solution for the second question?
Thanks.

The first approach looks fine to me.
For the DAG, if you can modify the DAG to add a cached value to each node, you can use the same algorithm with a small tweak: do not recurse into an operator node that already has a cached value. This runs in O(n+m) time (at most one arithmetic operation per node and at most one pointer lookup per edge). Explicitly:
evaluate(node) {
    if (node has value) {
        // leaf, or operator node that was already evaluated
        return node->value;
    } else {
        left_val = evaluate(node->left);
        right_val = evaluate(node->right);
        // find operation symbol for node and use it as
        // val = left_val operation right_val
        node->value = val;
        return val;
    }
}
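To make the caching idea concrete, here is a minimal Python sketch. The `Node` class, the `OPS` table and the restriction to four binary operators are assumptions made for illustration; they are not from the book or from the answer above.

```python
import operator

# Hypothetical node type: a leaf carries a number, an operator node a symbol.
class Node:
    def __init__(self, op=None, left=None, right=None, value=None):
        self.op = op          # "+", "-", "*", "/" for operator nodes, None for leaves
        self.left = left
        self.right = right
        self.value = value    # known for leaves, filled in lazily for operator nodes

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(node):
    # A leaf value or a cached value means this node is never expanded twice.
    if node.value is None:
        left_val = evaluate(node.left)
        right_val = evaluate(node.right)
        node.value = OPS[node.op](left_val, right_val)
    return node.value

# (2 + 3) * (2 + 3), where both factors are the same shared node of the DAG.
shared = Node("+", Node(value=2), Node(value=3))
root = Node("*", shared, shared)
result = evaluate(root)
```

The shared `+` node is computed once; the second edge into it only reads the cached value, which is where the O(n+m) bound comes from.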

Related

Non recursive DFS algorithm for simple paths between two points

I am looking for a non-recursive depth-first search algorithm to find all simple paths between two points in an undirected graph (cycles are possible).
I checked many posts, but all of them showed a recursive algorithm.
It seems no one is interested in a non-recursive version.
A recursive version looks like this:
void dfs(Graph G, int v, int t)
{
    path.push(v);
    onPath[v] = true;
    if (v == t)
    {
        print(path);
    }
    else
    {
        for (int w : G.adj(v))
        {
            if (!onPath[w])
                dfs(G, w, t);
        }
    }
    path.pop();
    onPath[v] = false;
}
So I tried the following non-recursive version, but when I checked it, it computed the wrong result:
void dfs(node start,node end)
{
stack m_stack=new stack();
m_stack.push(start);
while(!m_stack.empty)
{
var current= m_stack.pop();
path.push(current);
if (current == end)
{
print(path);
}
else
{
for ( node in adj(current))
{
if (!path.contain(node))
m_stack.push(node);
}
}
path.pop();
}
the test graph is:
(a,b),(b,a),
(b,c),(c,b),
(b,d),(d,b),
(c,f),(f,c),
(d,f),(f,d),
(f,h),(h,f).
It is undirected, which is why both (a,b) and (b,a) are listed.
If the start and end nodes are 'a' and 'h', then there should be two simple paths:
a,b,c,f,h
a,b,d,f,h.
But that algorithm could not find both.
It displayed the following output:
a,b,d,f,h,
a,b,d.
stack become at the start of second path, that is the problem.
Please point out my mistake in converting it to a non-recursive version.
Your help will be appreciated!
I think DFS is a fairly complicated algorithm, especially in its iterative form. The key insight for the iterative version is that the recursive version keeps not only the current node on the call stack but also the current neighbour (the position in the adjacency loop). With this in mind, an iterative version in C++ could look like this:
#include <cstddef>
#include <iostream>
#include <stack>
#include <utility>
#include <vector>
using namespace std;

//print is not defined in the original answer; this minimal version is an assumption
void print(const vector<size_t> &path)
{
    for (size_t i = 0; i < path.size(); ++i)
        cout << path[i] << ' ';
    cout << '\n';
}

//graph[i][j] stores the j-th neighbour of the node i
void dfs(size_t start, size_t end, const vector<vector<size_t> > &graph)
{
    //initialize:
    //remember the node (first) and the index of the next neighbour (second)
    typedef pair<size_t, size_t> State;
    stack<State> to_do_stack;
    vector<size_t> path; //remembering the way
    vector<bool> visited(graph.size(), false); //caching visited - no need for searching in the path-vector
    //start in start!
    to_do_stack.push(make_pair(start, 0));
    visited[start]=true;
    path.push_back(start);
    while(!to_do_stack.empty())
    {
        State &current = to_do_stack.top();//current stays on the stack for the time being...
        if (current.first == end || current.second == graph[current.first].size())//goal reached or done with neighbours?
        {
            if (current.first == end)
                print(path);//found a way!
            //backtrack:
            visited[current.first]=false;//no longer considered visited
            path.pop_back();//go a step back
            to_do_stack.pop();//no need to explore further neighbours
        }
        else{//normal case: explore neighbours
            size_t next=graph[current.first][current.second];
            current.second++;//update the next neighbour in the stack!
            if(!visited[next]){
                //putting the neighbour on the todo-list
                to_do_stack.push(make_pair(next, 0));
                visited[next]=true;
                path.push_back(next);
            }
        }
    }
}
No warranty that it is bug-free, but I hope you get the gist; at least it finds both paths in your example.
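The same idea (keep the node together with its position in the neighbour loop) can be sketched more compactly in Python by storing one neighbour iterator per node on the stack. This is only an illustrative sketch: the function name `all_simple_paths`, the dict-of-lists graph representation and the assumption that `None` is never a node label are mine, not part of the answer above.

```python
def all_simple_paths(graph, start, end):
    """Return every simple path from start to end, found by iterative DFS."""
    paths = []
    path = [start]
    on_path = {start}
    # stack[i] iterates over the not-yet-tried neighbours of path[i]
    stack = [iter(graph[start])]
    while stack:
        nxt = None
        if path[-1] == end:
            paths.append(list(path))
        else:
            for w in stack[-1]:          # resumes where this node left off
                if w not in on_path:
                    nxt = w
                    break
        if nxt is None:
            # goal reached or neighbours exhausted: backtrack one step
            stack.pop()
            on_path.discard(path.pop())
        else:
            stack.append(iter(graph[nxt]))
            path.append(nxt)
            on_path.add(nxt)
    return paths

# The test graph from the question, as adjacency lists.
GRAPH = {
    "a": ["b"],
    "b": ["a", "c", "d"],
    "c": ["b", "f"],
    "d": ["b", "f"],
    "f": ["c", "d", "h"],
    "h": ["f"],
}
paths = all_simple_paths(GRAPH, "a", "h")
```

Because each iterator remembers how far it has advanced, backtracking to a node continues with its next untried neighbour, exactly as the recursive `for` loop would.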
The path computation is all wrong. You pop the last node before you process its neighbors. Your code would output just the last node.
The simplest fix is to trust the compiler to optimize the recursive solution sufficiently that it won't matter. You can help by not passing large objects between calls and by avoiding allocating/deallocating many objects per call.
The easy fix is to store the entire path in the stack (instead of just the last node).
A harder fix is to have two types of entries on the stack: insert and remove. When you reach an insert entry for node x, you add x to the path, push a remove entry for x first, and then push an insert entry for every neighbour y. When you hit a remove entry for x, you pop the last value (x) from the path. This better simulates the dynamics of the recursive solution.
A better fix is to just do breadth-first-search since that's easier to implement in an iterative fashion.

How to store visited states in iterative deepening / depth limited search?

Update: Search for the first solution.
For a normal depth-first search it is simple: just use a hash set.
bool DFS (currentState)
{
    if (myHashSet.Contains(currentState))
    {
        return false;
    }
    else
    {
        myHashSet.Add(currentState);
    }
    if (IsSolution(currentState)) return true;
    else
    {
        for (var nextState in GetNextStates(currentState))
            if (DFS(nextState)) return true;
    }
    return false;
}
However, when the search becomes depth-limited, I cannot simply do this:
bool DFS (currentState, maxDepth)
{
    if (maxDepth == 0) return false;
    if (myHashSet.Contains(currentState))
    {
        return false;
    }
    else
    {
        myHashSet.Add(currentState);
    }
    if (IsSolution(currentState)) return true;
    else
    {
        for (var nextState in GetNextStates(currentState))
            if (DFS(nextState, maxDepth - 1)) return true;
    }
    return false;
}
Because then it is not going to do a complete search (in the sense of always being able to find a solution if there is one) within maxDepth.
How should I fix it? Would it add more space complexity to the algorithm?
Or does it simply not require memoizing the states at all?
Update:
For example, consider the following decision tree:
A - B - C - D - E - A
                |
                F - G (Goal)
We start from state A, and G is a goal state. Clearly there is a solution at depth 3.
However, using my implementation with a depth limit of 4, suppose the search happens to go
A(0) -> B(1) -> C(2) -> D(3) -> E(4) -> F(5). This exceeds the depth limit, so the search backtracks to A. However, E is now marked as visited, so the search skips the direction A - E - F - G.
I had the same problem as you; here's my thread: Iterative deepening in Common Lisp
Basically, to solve this problem using hash tables, you can't just check whether a node was visited before; you also have to consider the depth at which it was previously visited. If the node you're about to examine contains a state that was not previously seen, or that was seen before but at a greater depth, then you should still consider it, since it may lead to a shallower solution. That is what iterative deepening is supposed to do: it returns the same solution that BFS would return, which is the shallowest one. So in the hash table you can use the state as the key and the depth as the value. You will need to keep updating the depth value in the hash table whenever you find a shallower node, though.
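A rough Python sketch of the state-to-depth table follows. The function names, the `neighbours` dict and the neighbour ordering (chosen so that the search first walks the long way round, as in the question's example) are assumptions for illustration only.

```python
def depth_limited(state, goal, neighbours, limit, best_depth, depth=0):
    """Depth-limited DFS that re-expands a state only when reached at a smaller depth."""
    if best_depth.get(state, limit + 1) <= depth:
        return False              # already expanded at this depth or shallower
    best_depth[state] = depth     # record (or improve) the depth for this state
    if state == goal:
        return True
    if depth == limit:
        return False
    return any(depth_limited(n, goal, neighbours, limit, best_depth, depth + 1)
               for n in neighbours[state])

def iterative_deepening(start, goal, neighbours, max_limit):
    """Return the smallest depth limit at which the goal is found, or None."""
    for limit in range(max_limit + 1):
        if depth_limited(start, goal, neighbours, limit, {}):  # fresh table per limit
            return limit
    return None

# The example from the question: ring A-B-C-D-E-A, with F hanging off E and G off F.
NEIGHBOURS = {
    "A": ["B", "E"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"],
    "E": ["D", "A", "F"], "F": ["E", "G"], "G": ["F"],
}
found_with_limit_4 = depth_limited("A", "G", NEIGHBOURS, 4, {})
shallowest = iterative_deepening("A", "G", NEIGHBOURS, 4)
```

With a limit of 4, E is first reached at depth 4 via B, C and D, but it is expanded again when reached at depth 1 directly from A, so the path A - E - F - G is no longer skipped.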
An alternative solution for cycle checking would be to backtrack on the path from the current node up to the root, if the node you're about to examine already appears on the path, then it will lead to a cycle. This approach would be more generic, and can be used with any search strategy. It is slower than the hashtable approach though, having O(d) time complexity where d is the depth, but the memory complexity will be greatly reduced.
In each step of IDFS you are actually searching for a shortest path, so you can't simply use a hash set. A hash set helps only when you are searching for the existence of a path of unlimited length.
In this case, you should probably use a hash map to store the minimum number of steps needed to reach each state, and prune a branch only if the map value can't be improved. The time complexity may change accordingly.
But in fact, IDFS is used in place of BFS when space is limited. Since hashing the states may take almost as much space as BFS, you usually can't afford to store all the states in IDFS.
The IDFS on Wikipedia does not use a hash either: http://en.wikipedia.org/wiki/Iterative_deepening_depth-first_search
So let's drop the hash and trade time for space!
Update
It's worthwhile to store the states that are on the current DFS stack, so that the search path cannot run into a trivial cycle. Pseudocode implementing this would be:
bool DFS (currentState, maxDepth)
{
    if (maxDepth == 0) return false;
    if (myHashSet.Contains(currentState))
    {
        return false;
    }
    else
    {
        myHashSet.Add(currentState);
    }
    if (IsSolution(currentState)) return true;
    else
    {
        for (var nextState in GetNextStates(currentState))
            if (DFS(nextState, maxDepth - 1)) return true;
    }
    myHashSet.Remove(currentState); //the state is popped from the stack
    return false;
}
The solution you show is perfectly fine and works for DFSID(depth-first search with iterative deepening). Just do not forget to clear myHashSet before increasing the depth.

Removing duplicate subtrees from binary tree

I have to design an algorithm as an extra homework assignment. The algorithm has to compress a binary tree by transforming it into a DAG: it removes repeated subtrees and redirects all references to them to the one original subtree that is kept. For instance, I have a tree (I'm giving the nodes in preorder):
1 2 1 3 2 1 3
The algorithm has to remove the right connection of the root 1 (the right subtree, i.e. 2 1 3) and redirect it to the left connection (because these subtrees are the same and the left one came first in preorder, so we keep only the left one).
The way I see it: I traverse the tree in preorder. For the current node 'w', I start a recursion that has to detect the original subtree (if one exists) equal to the subtree rooted at 'w'. I stop the recursion when I find an equal subtree (and then do what must be done) or when I reach 'w' itself while searching. Of course I foresee some small improvements, such as comparing only subtrees with an equal number of nodes.
If I'm not wrong, this gives complexity O(n^2), where n is the number of nodes of the given binary tree. Is there any chance to do it faster (I think there is)? Is a linear algorithm possible?
It's a pity that my algorithm turned out to have complexity O(n^3). Your answers with hashing will probably be very useful to me later on, when I know much more. For now it's too difficult for me.
One last question: is there any chance to do it in O(n^2) using elementary techniques (not hashing)?
This happens when constructing OBDDs. The idea is: put the tree into a canonical form and construct a hash table with an entry for every node. The hash function is a function of the node plus the hash values of its left and right child nodes. Complexity is O(N), but only if one can rely on the hash values being unique. The final compare (e.g. for resolving collisions) will still cost O(N*N) for the recursive subtree <--> subtree compare.
More on BDDs or the original Bryant paper
The hashfunction I currently use:
#define SHUFFLE(x,n) (((x) << (n))|((x) >>(32-(n))))
/* a node's hashvalue is based on its value
* and (recursively) on its children's hashvalues.
*/
#define NODE_HASH2(l,r) ((SHUFFLE((l),5)^SHUFFLE((r),9)))
#define NODE_HASH3(v,l,r) ((0x54321u*(v) ^ NODE_HASH2((l),(r))))
Typical usage:
void node_sethash(NodeNum num)
{
if (NODE_IS_NULL(num)) return;
if (NODE_IS_TERMINAL(num)) switch (nodes[num].var) {
case 0: nodes[num].hash.hash= HASH_FALSE; break;
case 1: nodes[num].hash.hash= HASH_TRUE; break;
case 2: nodes[num].hash.hash= HASH_FALSE^HASH_TRUE; break;
}
else if (NODE_IS_NAMED(num)) {
NodeNum f,t;
f = nodes[num].negative;
t = nodes[num].positive;
nodes[num].hash.hash = NODE_HASH3 (nodes[num].var, nodes[f].hash.hash, nodes[t].hash.hash);
}
return ;
}
Searching the hash table:
NodeNum *hash_hnd(NodeNum num, int want_exact)
{
unsigned slot;
NodeNum *ptr, this;
if (NODE_IS_NULL(num)) return NULL;
slot = nodes[num].hash.hash % COUNTOF(hash_nodes);
for (ptr = &hash_nodes[slot]; !NODE_IS_NULL(this= *ptr); ptr = &nodes[this].hash.link) {
if (this == num) break;
if (want_exact) continue;
if (nodes[this].hash.hash != nodes[num].hash.hash) continue;
if (nodes[this].var != nodes[num].var) continue;
if (node_compare( nodes[this].negative , nodes[num].negative)) continue;
if (node_compare( nodes[this].positive , nodes[num].positive)) continue;
/* duplicate node := same var+same children */
break;
}
return ptr;
}
The recursive compare function:
int node_compare(NodeNum one, NodeNum two)
{
int rc;
if (one == two) return 0;
if (NODE_IS_NULL(one) && NODE_IS_NULL(two)) return 0;
if (NODE_IS_NULL(one) && !NODE_IS_NULL(two)) return -1;
if (!NODE_IS_NULL(one) && NODE_IS_NULL(two)) return 1;
if (NODE_IS_TERMINAL(one) && !NODE_IS_TERMINAL(two)) return -1;
if (!NODE_IS_TERMINAL(one) && NODE_IS_TERMINAL(two)) return 1;
if (VAR_RANK(nodes[one].var) < VAR_RANK(nodes[two].var) ) return -1;
if (VAR_RANK(nodes[one].var) > VAR_RANK(nodes[two].var) ) return 1;
rc = node_compare(nodes[one].negative,nodes[two].negative);
if (rc) return rc;
rc = node_compare(nodes[one].positive,nodes[two].positive);
if (rc) return rc;
return 0;
}
This problem is commonly solved to do common subexpression elimination in programming languages.
The approach is as follows (and is easily generalized to more than 2 children per node):
Algorithm (assumes a mutable tree structure; you can just as easily build a new tree along the way):
MakeDAG(tree):
    HASH = a new hash-table-based dictionary
    foreach subtree NODE in the tree // traverse this however you like
        if NODE is in HASH
            replace NODE with HASH[NODE]
        else
            HASH[NODE] = NODE // insert the current node in the dictionary
To compute the hash code for a node, you need to recursively compute the hash codes of its children until you reach the leaves of the tree.
Simply calculating these hash codes naively will bump up your runtime to O(n^2).
It is crucial to store each node's hash code once it has been computed, so that repeated recursive calls are avoided and the runtime improves to O(n).
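As an illustration, here is a small Python sketch of the MakeDAG idea. Instead of numeric hash codes it uses an exact dictionary key built from the node's value and the identities of its already-canonical children, which sidesteps collisions; that substitution, the `Node` class and the name `make_dag` are my own assumptions. It visits children before parents, and being recursive it would need an explicit stack for very deep trees.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def make_dag(root):
    """Share structurally equal subtrees in place; returns the canonical root."""
    canonical = {}  # (value, id of canonical left, id of canonical right) -> node

    def visit(node):
        if node is None:
            return None
        # Children first, so their canonical representatives are already known.
        node.left = visit(node.left)
        node.right = visit(node.right)
        key = (node.value, id(node.left), id(node.right))
        # Keep the first node seen with this key; later duplicates map onto it.
        return canonical.setdefault(key, node)

    return visit(root)

# The tree from the question, preorder 1 2 1 3 2 1 3.
tree = Node(1, Node(2, Node(1), Node(3)), Node(2, Node(1), Node(3)))
dag = make_dag(tree)
```

Each node is looked up once with a constant-size key, so the expected running time is linear in the number of nodes, assuming constant-time dictionary operations.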
I would go with a hashing approach.
The hash of a leaf is its value mod P_1. The hash of a node is (value+hash(left_son)*P_2+hash(right_son)*P_2^2) mod P_1, where P_1 and P_2 are primes. If you compute those hashes for at least 5 different pairs of big primes (by big I mean something near 10^8-10^9, so you can do your math without overflowing), you can safely assume that nodes with the same hashes are the same.
Then you can walk the tree, checking sons first, and do your transform. This will work in O(n) time.
NOTE that you can use other hash functions, like (value + hash(left_son)*P_2 + hash(right_son)*P_3) mod P_1, etc.

How to find the rank of a node in an AVL tree?

I need to implement two rank queries [rank(k) and select(r)]. But before I can start on this, I need to figure out how the two functions work.
As far as I know, rank(k) returns the rank of a given key k, and select(r) returns the key of a given rank r.
So my questions are:
1.) How do you calculate the rank of a node in an AVL tree (a self-balancing BST)?
2.) Is it possible for more than one key to have the same rank? And if so, what would select(r) return?
I'm going to include a sample AVL tree which you can refer to if it helps answer the question.
Thanks!
Your question really boils down to: "how is the term 'rank' normally defined with respect to an AVL tree?" (and, possibly, how is 'select' normally defined as well).
At least as I've seen the term used, "rank" means the position among the nodes in the tree -- i.e., how many nodes are to its left. You're typically given a pointer to a node (or perhaps a key value) and you need to count the number of nodes to its left.
"Select" is basically the opposite -- you're given a particular rank, and need to retrieve a pointer to the specified node (or the key for that node).
Two notes: First, since neither of these modifies the tree at all, it makes no real difference what form of balancing is used (e.g., AVL vs. red/black); for that matter a tree with no balancing at all is equivalent as well. Second, if you need to do this frequently, you can improve speed considerably by adding an extra field to each node recording how many nodes are to its left.
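To illustrate the extra-field idea, here is a hedged Python sketch using a plain unbalanced BST with distinct keys. The field name `left_count`, the 0-based rank convention ("number of nodes to its left") and the helper names are assumptions; a real AVL implementation would also have to update the field during rotations, which is not shown.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.left_count = 0   # number of nodes in the left subtree

def insert(root, key):
    """Plain BST insert of a new distinct key; keeps left_count up to date."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            node.left_count += 1
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right

def rank(root, key):
    """Number of keys smaller than key (0-based rank), or None if key is absent."""
    smaller, node = 0, root
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            smaller += node.left_count + 1   # skip this node and its left subtree
            node = node.right
        else:
            return smaller + node.left_count
    return None

def select(root, r):
    """Key with 0-based rank r, or None if r is out of range."""
    node = root
    while node is not None:
        if r < node.left_count:
            node = node.left
        elif r > node.left_count:
            r -= node.left_count + 1
            node = node.right
        else:
            return node.key
    return None

root = None
for k in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, k)
```

Both queries follow a single root-to-node path, so on a balanced tree such as an AVL tree they take O(log n) time.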
Rank is the number of nodes in the left subtree plus one, and is calculated for every node. I believe rank is not a concept specific to AVL trees - it can be calculated for any binary tree.
Select is just the opposite of rank. A rank is given and you have to return the node matching that rank.
The following code will perform the rank calculation:
int NumberOfNodesInTree(struct TreeNode *Node);

void InitRank(struct TreeNode *Node)
{
    if(!Node)
    {
        return;
    }
    else
    {
        Node->rank = 1 + NumberOfNodesInTree(Node->LChild);
        InitRank(Node->LChild);
        InitRank(Node->RChild);
    }
}
int NumberOfNodesInTree(struct TreeNode *Node)
{
    if(!Node)
    {
        return 0;
    }
    else
    {
        return(1+NumberOfNodesInTree(Node->LChild)+NumberOfNodesInTree(Node->RChild));
    }
}
Here is the code I wrote, which worked fine for an AVL tree, to get the rank of a particular value. The only difference is that you used a node as the parameter and I used a key. You can modify this to suit your needs. Sample code:
public int rank(int data){
return rank(data,root);
}
private int rank(int data, AVLNode r){
int rank=1;
while(r != null){
if(data<r.data)
r = r.left;
else if(data > r.data){
rank += 1+ countNodes(r.left);
r = r.right;
}
else{
r.rank=rank+countNodes(r.left);
return r.rank;
}
}
return 0;
}
[N.B.] If you want ranks to start from 0, initialize the variable rank to 0.
Note that you need to have implemented the method countNodes() for this code to run.

How to determine if a linked list has a cycle using only two memory locations

Does anyone know of an algorithm to find if a linked list loops on itself using only two variables to traverse the list. Say you have a linked list of objects, it doesn't matter what type of object. I have a pointer to the head of the linked list in one variable and I am only given one other variable to traverse the list with.
So my plan is to compare pointer values to see if any pointers are the same. The list is of finite size but may be huge. I can set both variables to the head and then traverse the list with the second variable, always checking whether it is equal to the first, but if I do hit a loop I will never get out of it. I'm thinking the solution has to do with traversing the list at different rates and comparing pointer values. Any thoughts?
I would suggest using Floyd's Cycle-Finding Algorithm aka The Tortoise and the Hare Algorithm. It has O(n) complexity and I think it fits your requirements.
Example code:
boolean hasLoop(Node startNode) {
    Node slowNode = startNode, fastNode1 = startNode, fastNode2 = startNode;
    // the fast pointer advances two steps per iteration, the slow pointer one
    while (slowNode != null
            && (fastNode1 = fastNode2.next()) != null
            && (fastNode2 = fastNode1.next()) != null) {
        if (slowNode == fastNode1 || slowNode == fastNode2) return true;
        slowNode = slowNode.next();
    }
    return false;
}
More info on Wikipedia: Floyd's cycle-finding algorithm.
You can use the Turtle and Rabbit algorithm.
Wikipedia has an explanation too, and they call it "Floyd's cycle-finding algorithm" or "Tortoise and hare"
Absolutely. One solution indeed can be traversing the list with both pointers, one travelling at twice the rate of the other.
Start with the 'slow' and the 'fast' pointer pointing to any location in the list. Run the traversal loop. If the 'fast' pointer at any time comes to coincide with the slow pointer, you have a circular linked list.
Node *head = list.GetHead();
if (head != NULL) {
    Node *fastPtr = head;
    Node *slowPtr = head;
    bool isCircular = true;
    do
    {
        if (fastPtr->Next == NULL || fastPtr->Next->Next == NULL) //List end found
        {
            isCircular = false;
            break;
        }
        fastPtr = fastPtr->Next->Next;
        slowPtr = slowPtr->Next;
    } while (fastPtr != slowPtr);
    //Do whatever you want with the 'isCircular' flag here
}
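For comparison, the two-pointer traversal described above might look like this in Python. The `ListNode` class and the `build` helper are hypothetical scaffolding for the example.

```python
class ListNode:
    def __init__(self, value):
        self.value = value
        self.next = None

def has_cycle(head):
    """Floyd's tortoise and hare: True if the list loops back on itself."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next          # one step
        fast = fast.next.next     # two steps
        if slow is fast:
            return True
    return False                  # fast ran off the end, so there is no cycle

def build(values, cycle_to=None):
    """Helper: build a list; if cycle_to is an index, link the tail back to it."""
    nodes = [ListNode(v) for v in values]
    for a, b in zip(nodes, nodes[1:]):
        a.next = b
    if nodes and cycle_to is not None:
        nodes[-1].next = nodes[cycle_to]
    return nodes[0] if nodes else None
```

Only the two references `slow` and `fast` are needed during the traversal, which matches the two-variable constraint in the question.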
I tried to solve this myself and found a different (less efficient but still asymptotically optimal) solution.
The idea is based on reversing a singly linked list in linear time. This can be done by doing two swaps at each step in iterating over the list. If q is the previous element (initially null) and p is the current, then swap(q,p->next) swap(p,q) will reverse the link and advance the two pointers at the same time. The swaps can be done using XOR to prevent having to use a third memory location.
If the list has a cycle then at one point during the iteration you will arrive at a node whose pointer has already been changed. You cannot know which node that is, but by continuing the iteration, swapping some elements twice, you arrive at the head of the list again.
By reversing the list twice, the list remains unchanged in result and you can tell if it had a cycle based on whether you arrived at the original head of the list or not.
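Here is a sketch of the reversal idea in Python. It uses an ordinary temporary variable where the answer suggests XOR swaps, so it does not meet the strict two-location limit; the names `reverse` and `has_cycle_by_reversal` are assumptions for illustration.

```python
class ListNode:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def reverse(head):
    """Reverse the links in place and return the node where the walk ended."""
    prev, cur = None, head
    while cur is not None:
        nxt = cur.next        # temporary; the answer avoids it with XOR swaps
        cur.next = prev
        prev, cur = cur, nxt
    return prev

def has_cycle_by_reversal(head):
    if head is None or head.next is None:
        return False          # empty list or a single node without a self-loop
    end = reverse(head)       # with a cycle, the walk comes back to the head
    reverse(end)              # second pass restores the original links
    return end is head

# a -> b -> c -> d -> b (cycle), and x -> y -> z (no cycle)
d = ListNode("d"); c = ListNode("c", d); b = ListNode("b", c); a = ListNode("a", b)
d.next = b
z = ListNode("z"); y = ListNode("y", z); x = ListNode("x", y)
cyclic = has_cycle_by_reversal(a)
acyclic = has_cycle_by_reversal(x)
```

In an acyclic list with at least two nodes the first pass ends at the tail, which differs from the head; with a cycle it ends back at the head, and in both cases the second pass puts every link back.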
int isListCircular(ListNode* head){
if(head==NULL)
return 0;
ListNode *fast=head, *slow=head;
while(fast && fast->next){
if(fast->next->next==slow)
return 1;
fast=fast->next->next;
slow=slow->next;
}
return 0;
}
bool findCircular(Node *head)
{
    if (!head)
        return false;
    Node *slower = head;
    Node *faster = head->next;
    while(true) {
        if ( !faster || !faster->next)
            return false;
        else if (faster == slower || faster->next == slower)
            return true;
        faster = faster->next->next;
        slower = slower->next; // advance the slow pointer too, or a cycle that skips the head is never detected
    }
}
Taking this problem a step further would be identifying the cycle (that is, not just whether a cycle exists, but where exactly it is in the list).
The tortoise and hare algorithm can be used for this as well; however, we need to keep track of the head of the list at all times. An illustration of this algorithm can be found here.
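A minimal Python sketch of that extension follows, assuming the usual second phase of Floyd's algorithm: after the pointers meet, restart one of them from the head and advance both one step at a time. The `ListNode` class and the function name are illustrative assumptions.

```python
class ListNode:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def find_cycle_start(head):
    """Return the first node of the cycle, or None if the list has no cycle."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            break
    else:
        return None           # fast reached the end: no cycle
    # Phase 2: this is where the head of the list is needed again.
    slow = head
    while slow is not fast:
        slow = slow.next
        fast = fast.next
    return slow

# a -> b -> c -> d -> e -> c (the cycle starts at c)
e = ListNode("e"); d = ListNode("d", e); c = ListNode("c", d)
b = ListNode("b", c); a = ListNode("a", b)
e.next = c
start = find_cycle_start(a)
```

The second phase works because the head and the meeting point are the same distance from the cycle's first node, modulo the cycle length.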

Resources