I want to write pseudo-code for visiting a state machine using DFS. A state machine can be considered a directed graph. The following algorithm from the Cormen book uses DFS to visit a graph.
DFS-VISIT(G, u) // G = graph, u = root vertex
    u.color = GRAY
    for each v from G.Adjacents(u) // edge (u, v)
        if v.color == WHITE
            DFS-VISIT(G, v)
A state machine, however, can have multiple edges between the same two vertices, and the algorithm above stores edges in an adjacency list. I have implemented it in Java with the following classes:
class Node {
    String name;
    ....
    ArrayList<Transition> transitions;
    void addTransition(Transition tr);
}
class Transition {
    String src;
    String dest;
}
Using these classes, I have built a state machine from node and transition objects. I want to modify the algorithm above for the case where I don't have a graph object G and only have access to the root object.
Can I modify the above algorithm like below?
DFS-VISIT(root) // root node
    root.color = GRAY
    for each t from root.getTransitions() // t = transition
        v = t.getDestination() // v = destination of t
        if v.color == WHITE
            DFS-VISIT(v)
The algorithm is independent of the implementation. Actually, it's the other way around. The question you should be asking is this: "does your own implementation have all the properties that the algorithm requires?"
The algorithm has very few strict demands. You need a set of nodes and a set of edges, where each edge connects 2 nodes. That's the generic definition of graph. How you acquire, store and process these sets is irrelevant to the algorithm. For the algorithm step all you need is access to a given node from the set and access to a set of its neighbors. What you've presented seems fine (of course, for the next step you'll need to progress to the next node after root).
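As a rough sketch of that point (written in Python rather than Java, with a hypothetical `Node` class and `dfs_visit` function, and with each transition reduced to a reference to its destination node), a DFS that starts from the root alone and tolerates several transitions between the same two states might look like this:

```python
WHITE, GRAY, BLACK = "white", "gray", "black"


class Node:
    """Hypothetical state: a name, a colour, and outgoing transitions."""

    def __init__(self, name):
        self.name = name
        self.color = WHITE
        self.transitions = []  # destination nodes; duplicates are allowed

    def add_transition(self, dest):
        self.transitions.append(dest)


def dfs_visit(root, order):
    """Visit every state reachable from root, appending names to order."""
    root.color = GRAY
    order.append(root.name)
    for dest in root.transitions:
        # A parallel transition simply finds dest already GRAY or BLACK.
        if dest.color == WHITE:
            dfs_visit(dest, order)
    root.color = BLACK


a, b, c = Node("a"), Node("b"), Node("c")
a.add_transition(b)
a.add_transition(b)  # second transition between the same two states
b.add_transition(c)
c.add_transition(a)  # cycle back to the root
visited = []
dfs_visit(a, visited)
print(visited)
```

No graph object is needed because each node already gives access to its own neighbours, which is all the algorithm asks for.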
I'm referring to the link below to try and write an algorithm to find a vertex in a tree so that removing that vertex gives connected components with the size of each component being at most V/2 vertices.
https://math.stackexchange.com/questions/1742440/you-can-always-delete-a-vertex-from-a-tree-g-such-that-the-remaining-connected
I do understand the proof given in the accepted answer which uses arrows to find that vertex. I can't quite figure out how to write an algorithm for the same.
I will first explain the proof and then give the pseudo code, so that the pseudo code is easier to follow. The vertex you are looking for is called the centroid, so we need to find the centroid of the tree.
First of all, note that such a node always exists and that a tree has at most two of them (two only when the number of vertices is even); finding any one of them is enough.
Let the given tree be T with n vertices. Start from any vertex and assume it is the required vertex. Then check whether this is true. If it is, nothing more needs to be done. If it is not, move to the adjacent vertex that lies in the subtree containing more than n/2 vertices. Repeat the process until you find the answer.
Now the pseudo code. Here is the meaning of the variables used.
v_centroid stores the centroid
v[i] stores the list of all nodes that are connected to i
size[i] stores the size of subtree of i.
v_centroid = any vertex
dfs(v_centroid,parent) // v_centroid is the assumed centroid and parent is parent of the node processing. For initial call you can use -1 as parent or any other undefined value suitable.
v_centroid = findCentroid(v_centroid,v_centroid)
func dfs(int node, int parent)
    size[node] := 1
    for i in v[node]
        if (i not equals parent)
            dfs(i, node)
            size[node] = size[node] + size[i] // add the child's subtree size
        end if
    end for
end func
func findCentroid(int node, int parent)
    for i in v[node]
        // MAX_SIZE is taken here to be the total number of vertices n
        if (i not equals parent and size[i] > MAX_SIZE/2)
            return findCentroid(i, node)
        end if
    end for
    return node
end func
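The two passes above can be sketched in runnable form as follows. This is only an illustration under some assumptions: the tree is given as a dictionary mapping each vertex to its neighbour list, the function name `find_centroid` is made up, and the size computation is done iteratively instead of recursively so that deep trees do not hit Python's recursion limit.

```python
def find_centroid(adj):
    """Return a centroid of the tree given as {vertex: [neighbours]}."""
    n = len(adj)
    root = next(iter(adj))

    # Pass 1: record parents in pre-order, then fill subtree sizes bottom-up.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        for nxt in adj[node]:
            if nxt != parent[node]:
                parent[nxt] = node
                stack.append(nxt)
    size = {}
    for node in reversed(order):  # children always come before their parent
        size[node] = 1 + sum(size[c] for c in adj[node] if c != parent[node])

    # Pass 2: walk towards the child whose subtree is too heavy, if any.
    node = root
    while True:
        heavy = next((c for c in adj[node]
                      if c != parent[node] and 2 * size[c] > n), None)
        if heavy is None:
            return node
        node = heavy


path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
star = {1: [0], 0: [1, 2, 3, 4], 2: [0], 3: [0], 4: [0]}
print(find_centroid(path), find_centroid(star))
```

The walk never needs to check the component on the parent's side: it only moves into a subtree with more than n/2 vertices, so everything left behind has fewer than n/2.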
I am learning graph traversal from The Algorithm Design Manual by Steven S. Skiena. In his book, he has provided the code for traversing the graph using dfs. Below is the code.
dfs(graph *g, int v)
{
    edgenode *p;            /* temporary pointer */
    int y;                  /* successor vertex */

    if (finished) return;   /* allow for search termination */

    discovered[v] = TRUE;
    time = time + 1;
    entry_time[v] = time;

    process_vertex_early(v);

    p = g->edges[v];
    while (p != NULL) {
        y = p->y;
        if (discovered[y] == FALSE) {
            parent[y] = v;
            process_edge(v,y);
            dfs(g,y);
        }
        else if ((!processed[y]) || (g->directed))
            process_edge(v,y);

        if (finished) return;

        p = p->next;
    }

    process_vertex_late(v);

    time = time + 1;
    exit_time[v] = time;

    processed[v] = TRUE;
}
In an undirected graph, it looks like the code above processes each edge twice (it calls process_edge(v,y) once while traversing vertex v and again while processing vertex y). So I have added the condition parent[v]!=y to else if ((!processed[y]) || (g->directed)). With that change it processes the edge only once. However, I am not sure how to modify this code to work with parallel edges and self-loops. The code should process parallel edges and self-loops as well.
Short Answer:
Substitute your (parent[v]!=y) for (!processed[y]) instead of adding it to the condition.
Detailed Answer:
In my opinion there is a mistake in the implementation written in the book, which you discovered and fixed (except for parallel edges; more on that below). The implementation is supposed to be correct for both directed and undirected graphs, with the distinction between them recorded in the g->directed boolean property.
In the book, just before the implementation the author writes:
The other important property of a depth-first search is that it partitions the
edges of an undirected graph into exactly two classes: tree edges and back edges. The
tree edges discover new vertices, and are those encoded in the parent relation. Back
edges are those whose other endpoint is an ancestor of the vertex being expanded,
so they point back into the tree.
So the condition (!processed[y]) is supposed to handle undirected graphs (just as the condition (g->directed) handles directed graphs) by allowing the algorithm to process the edges that are back edges and preventing it from re-processing those that are tree edges (in the opposite direction). As you noticed, though, with this condition the tree edges are treated as back edges when read from the child, so you should just replace this condition with your suggested (parent[v]!=y).
The condition (!processed[y]) will ALWAYS be true for an undirected graph when the algorithm reads it as long as there are no parallel edges (further details why this is true - *). If there are parallel edges - those parallel edges that are read after the first "copy" of them will yield false and the edge will not be processed, when it should be. Your suggested condition, however, will distinguish between tree-edges and the rest (back-edges, parallel edges and self-loops) and allow the algorithm to process only those that are not tree-edges in the opposite direction.
To refer to self-edges, they should be fine both with the new and old conditions: they are edges with y==v. Getting to them, y is discovered (because v is discovered before going through its edges), not processed (v is processed only as the last line - after going through its edges) and it is not v's parent (v is not its own parent).
*Going through v's edges, the algorithm reads this condition for y that has been discovered (so it doesn't go into the first conditional block). As quoted above (in the book there is a semi-proof for that as well which I will include at the end of this footnote), p is either a tree-edge or a back-edge. As y is discovered, it cannot be a tree-edge from v to y. It can be a back edge to an ancestor which means the call is in a recursion call that started processing this ancestor at some point, and so the ancestor's call has yet to reach the final line, marking it as processed (so it is still marked as not processed) and it can be a tree-edge from y to v, in which case the same situation holds - and y is still marked as not processed.
The semi-proof for every edge being a tree-edge or a back-edge:
Why can’t an edge go to a brother or cousin node instead of an ancestor?
All nodes reachable from a given vertex v are expanded before we finish with the
traversal from v, so such topologies are impossible for undirected graphs.
You are correct.
Quoting the book's (2nd edition) errata:
(*) Page 171, line -2 -- The dfs code has a bug, where each tree edge
is processed twice in undirected graphs. The test needs to be
strengthed to be:
else if (((!processed[y]) && (parent[v]!=y)) || (g->directed))
As for cycles - see here
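To see the effect of the two tests side by side, here is a small Python sketch (not the book's C code) that mimics the structure of the dfs routine on an undirected graph stored as neighbour lists. The function name `dfs_processed_edges` and the `skip_parent_edge` flag are made up for this illustration: with the flag off it applies the book's test (!processed[y]), with the flag on it applies the errata's test, which also requires parent[v] != y.

```python
def dfs_processed_edges(adj, start, skip_parent_edge):
    """Run a DFS over an undirected graph and return the processed edges.

    adj maps each vertex to its neighbour list. skip_parent_edge=False mimics
    the book's test; True adds the errata's parent[v] != y requirement.
    """
    discovered, processed, parent = set(), set(), {start: None}
    edges = []

    def dfs(v):
        discovered.add(v)
        for y in adj[v]:
            if y not in discovered:
                parent[y] = v
                edges.append((v, y))  # tree edge
                dfs(y)
            elif y not in processed and not (skip_parent_edge and parent[v] == y):
                # back edge, or (with the book's test) a tree edge seen again
                edges.append((v, y))
        processed.add(v)

    dfs(start)
    return edges


triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
book = dfs_processed_edges(triangle, 0, skip_parent_edge=False)
errata = dfs_processed_edges(triangle, 0, skip_parent_edge=True)
print(book)
print(errata)
```

On this triangle the book's test reports five edge visits for three edges, while the errata's test reports each edge once. The sketch stores plain neighbour lists, so it cannot tell a parallel edge to the parent from the tree edge itself; handling that case would need the edges to carry their own identifiers, which is not attempted here.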
I'm working on a small drawing application in Java. I'm trying to create a 'bucket-fill' tool by implementing the Flood Fill algorithm.
I tried using a recursive implementation, but it was problematic. I searched around the web, and it seems that a non-recursive implementation of this algorithm is recommended for this purpose.
So I ask you:
Could you describe a non-recursive implementation of the Flood Fill algorithm? An actual code example, some pseudo-code, or even a general explanation will all be welcome.
I'm looking for the simplest or the most efficient implementation you can think of.
(Doesn't have to be Java specific).
Thank you
I'm assuming that you have some sort of a grid where you receive the coordinates of the location from where you would like to fill the area.
The recursive flood fill algorithm is a DFS. You can use a BFS instead to make it non-recursive.
Basically the idea is similar in both the algorithms. You have a bag in which the nodes that are yet to be seen are kept. You remove a node from the bag and put the valid neighbors of the node back into the bag.
If the bag is a stack you get a DFS. If it's a queue you get a BFS.
The pseudocode is roughly this:
flood_fill(x, y, check_validity)
    // here check_validity is a function that, given the coordinates of a point,
    // tells you whether the point should be colored or not
    Queue q
    q.push((x, y))
    while (q is not empty)
        (x1, y1) = q.pop()
        color(x1, y1)
        if (check_validity(x1+1, y1))
            q.push((x1+1, y1))
        if (check_validity(x1-1, y1))
            q.push((x1-1, y1))
        if (check_validity(x1, y1+1))
            q.push((x1, y1+1))
        if (check_validity(x1, y1-1))
            q.push((x1, y1-1))
NOTE: make sure that check_validity takes into account whether the point is already colored or not.
DFS: Depth First Search
BFS: Breadth First Search
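A minimal runnable version of the queue-based idea, assuming the image is a list of rows of color values and that "valid" means "inside the grid and still holding the original color" (the function name `flood_fill` and the grid layout are choices made for this sketch):

```python
from collections import deque


def flood_fill(grid, x, y, new_color):
    """Recolour the 4-connected region containing (x, y) in place."""
    old_color = grid[y][x]
    if old_color == new_color:
        return grid  # nothing to do, and avoids an endless loop
    height, width = len(grid), len(grid[0])
    queue = deque([(x, y)])
    grid[y][x] = new_color  # colour on enqueue so no cell is queued twice
    while queue:
        cx, cy = queue.popleft()
        for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            if 0 <= nx < width and 0 <= ny < height and grid[ny][nx] == old_color:
                grid[ny][nx] = new_color
                queue.append((nx, ny))
    return grid


image = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 0, 0],
]
flood_fill(image, 0, 0, 7)
print(image)
```

Replacing `popleft()` with `pop()` turns the queue into a stack and the traversal into a DFS; the filled region is the same either way.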
You basically have two ways to implement a flood fill algorithm non-recursively. The first method has been clearly explained by sukunrt in which you use a queue to implement breadth first search.
Alternatively, you can implement the recursive DFS non-recursively by using an explicit stack. For example, the following code implements a non-recursive DFS on a graph whose nodes are integers. It uses an array of Iterator objects to keep track of which neighbors in each node's adjacency list have already been processed. The complete code can be accessed here.
public NonrecursiveDFS(Graph G, int s) {
    marked = new boolean[G.V()];

    // to be able to iterate over each adjacency list, keeping track of which
    // vertex in each adjacency list needs to be explored next
    Iterator<Integer>[] adj = (Iterator<Integer>[]) new Iterator[G.V()];
    for (int v = 0; v < G.V(); v++)
        adj[v] = G.adj(v).iterator();

    // depth-first search using an explicit stack
    Stack<Integer> stack = new Stack<Integer>();
    marked[s] = true;
    stack.push(s);
    while (!stack.isEmpty()) {
        int v = stack.peek();
        if (adj[v].hasNext()) {
            int w = adj[v].next();
            if (!marked[w]) {
                // discovered vertex w for the first time
                marked[w] = true;
                // edgeTo[w] = v;
                stack.push(w);
            }
        }
        else {
            // v's adjacency list is exhausted
            stack.pop();
        }
    }
}
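The same iterator-per-vertex trick can be sketched compactly in Python. The assumptions here are an adjacency dictionary instead of a `Graph` class, a made-up function name `iterative_dfs`, and a private sentinel object to signal an exhausted iterator:

```python
_EXHAUSTED = object()  # sentinel: the vertex has no more neighbours to try


def iterative_dfs(adj, start):
    """Return vertices in the order a recursive DFS would first visit them."""
    marked = {start}
    order = [start]
    # Each stack frame keeps the vertex and an iterator that remembers how far
    # we got in its adjacency list, just like a suspended recursive call.
    stack = [(start, iter(adj[start]))]
    while stack:
        v, neighbours = stack[-1]
        w = next(neighbours, _EXHAUSTED)
        if w is _EXHAUSTED:
            stack.pop()  # v's adjacency list is exhausted
        elif w not in marked:
            marked.add(w)
            order.append(w)
            stack.append((w, iter(adj[w])))
    return order


graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(iterative_dfs(graph, 0))
```

Because each frame resumes exactly where it stopped, the visiting order matches the recursive version, which a plain "push all neighbours" stack does not guarantee.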
How can we detect whether a directed graph is cyclic? I thought of using breadth-first search, but I'm not sure. Any ideas?
What you really need, I believe, is a topological sorting algorithm like the one described here:
http://en.wikipedia.org/wiki/Topological_sorting
If the directed graph has a cycle then the algorithm will fail.
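One way to make "the algorithm will fail" concrete is Kahn's algorithm, one of the topological sorting methods described on that page: repeatedly remove vertices with no incoming edges, and if some vertices can never be removed, they lie on or behind a cycle. A small sketch, assuming the graph is a dictionary from vertex to successor list (the function name `has_cycle_kahn` is made up):

```python
from collections import deque


def has_cycle_kahn(adj):
    """Return True if the directed graph {vertex: [successors]} has a cycle."""
    indegree = {v: 0 for v in adj}
    for successors in adj.values():
        for w in successors:
            indegree[w] = indegree.get(w, 0) + 1
    ready = deque(v for v, d in indegree.items() if d == 0)
    removed = 0
    while ready:
        v = ready.popleft()
        removed += 1
        for w in adj.get(v, ()):
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)
    # Vertices that were never removed could not be ordered topologically.
    return removed < len(indegree)


two_paths = {"x": ["a", "y"], "a": ["y"], "y": []}  # two routes x -> y, no cycle
loop = {"a": ["b"], "b": ["c"], "c": ["a"]}
print(has_cycle_kahn(two_paths), has_cycle_kahn(loop))
```

Note that this approach is queue-based, so it is arguably the closest thing to a breadth-first way of detecting cycles.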
The comments/replies that I've seen so far seem to be missing the fact that in a directed graph there may be more than one way to get from node X to node Y without there being any (directed) cycles in the graph.
Usually depth-first search is used instead. I don't know if BFS is applicable easily.
In DFS, a spanning tree is built in the order of visiting. If an ancestor of a node in the tree is visited again from that node (i.e. a back edge is found), then we have detected a cycle.
See http://www.cs.nyu.edu/courses/summer04/G22.1170-001/6a-Graphs-More.pdf for a more detailed explanation.
Use DFS to check whether any path is cyclic:
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Node<T> { T value; List<Node<T>> adjacent; }

class Graph<T> {
    List<Node<T>> nodes;

    public boolean isCyclicRec()
    {
        for (Node<T> node : nodes)
        {
            // start a fresh path from every node
            Set<Node<T>> initPath = new HashSet<>();
            if (isCyclicRec(node, initPath))
            {
                return true;
            }
        }
        return false;
    }

    private boolean isCyclicRec(Node<T> currNode, Set<Node<T>> path)
    {
        if (path.contains(currNode))
        {
            // currNode is already on the current path: cycle found
            return true;
        }
        else
        {
            path.add(currNode);
            for (Node<T> node : currNode.adjacent)
            {
                if (isCyclicRec(node, path))
                {
                    return true;
                }
                else
                {
                    // backtrack: node is no longer on the current path
                    path.remove(node);
                }
            }
        }
        return false;
    }
}
approach 1:
How about assigning level numbers to detect a cycle? For example, consider the graph A->(B,C), B->D, D->(E,F), E->G, F->G, E->D. As you perform a DFS, assign a level number to each node you visit (root A=0), where the level number of a node is its parent's level + 1. So A=0, B=1, D=2, F=3, G=4; then the recursion returns to D, so E=3. Don't re-mark G (G already has a level number assigned, which is greater than E's). Now E also has an edge to D, so levelization would say D should get a level number of 4. But D already has a lower level, 2, assigned to it. Thus, any time you attempt to assign a level number to a node that already has a lower level number, you conclude that the directed graph has a cycle. Be aware that this test can give a false positive on an acyclic graph in which a node is reachable by paths of different lengths (for example A->C, A->B, B->C, visiting C first), so approach 2 is the more reliable one.
approach 2:
Use 3 colors: white, gray, black. Only white nodes get colored: turn white nodes gray as you go down the DFS, and turn gray nodes black when the recursion unwinds (all children are processed). If a node's children are not all processed yet and you hit a gray node, that's a cycle.
E.g.: all nodes are white to begin with in the directed graph above.
A, B, D, F, G are colored from white to gray. G is a leaf, so all its children are processed; color it from gray to black. The recursion unwinds to F (all children processed); color it black. Now you are back at D. D has unprocessed children, so color E gray. G is already black, so don't go further down. E also has an edge to D, so while still processing D (D is still gray), you find an edge back to D (a gray node): a cycle is detected.
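The three-color scheme can be sketched in a few lines of Python. The assumptions are an adjacency dictionary in which every node appears as a key, a made-up function name `has_cycle_colors`, and plain recursion (so very deep graphs would need an explicit stack instead):

```python
def has_cycle_colors(adj):
    """Return True if the directed graph {node: [successors]} has a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in adj}

    def visit(v):
        color[v] = GRAY  # v is on the current DFS path
        for w in adj[v]:
            if color[w] == GRAY:
                return True  # edge back to a node still being processed
            if color[w] == WHITE and visit(w):
                return True
        color[v] = BLACK  # all children processed
        return False

    return any(color[v] == WHITE and visit(v) for v in adj)


# The example graph from the text: A->(B,C), B->D, D->(E,F), E->(G,D), F->G.
example = {"A": ["B", "C"], "B": ["D"], "C": [], "D": ["E", "F"],
           "E": ["G", "D"], "F": ["G"], "G": []}
without_back_edge = dict(example, E=["G"])
print(has_cycle_colors(example), has_cycle_colors(without_back_edge))
```

Reaching a black node (such as G from E) is harmless, which is why this scheme does not mistake two different routes to the same node for a cycle.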
Attempting a topological sort of the given graph will lead you to the solution. A topological sort requires that all edges point in one direction along the ordering; if the algorithm fails to produce such an ordering, the graph contains a cycle.
Another simple solution would be a mark-and-sweep approach. Basically, for each node in the tree you flag it as "visited" and then move on to its children. If you ever see a node with the "visited" flag set, you know there's a cycle.
If modifying the graph to include a "visited" bit isn't possible, a set of node pointers can be used instead. To flag a node as visited, you place a pointer to it in the set. If the pointer is already in the set, there's a cycle.
Is there an established algorithm for finding redundant edges in a graph?
For example, I'd like to find that a->d and a->e are redundant, and then get rid of them, like this:
[original graph] => [reduced graph]
Edit: Strilanc was nice enough to read my mind for me. "Redundant" was too strong a word, since in the example above, neither a->b nor a->c is considered redundant, but a->d is.
You want to compute the smallest graph which maintains vertex reachability.
This is called the transitive reduction of a graph. The wikipedia article should get you started down the right road.
Since the Wikipedia article mentioned by @Craig gives only a hint for an implementation, I am posting my implementation with Java 8 streams:
Map<String, Set<String>> reduction = usages.entrySet().stream()
    .collect(toMap(
        Entry::getKey,
        (Entry<String, Set<String>> entry) -> {
            String start = entry.getKey();
            Set<String> neighbours = entry.getValue();
            Set<String> visited = new HashSet<>();
            Queue<String> queue = new LinkedList<>(neighbours);
            while (!queue.isEmpty()) {
                String node = queue.remove();
                usages.getOrDefault(node, emptySet()).forEach(next -> {
                    if (next.equals(start)) {
                        throw new RuntimeException("Cycle detected!");
                    }
                    if (visited.add(next)) {
                        queue.add(next);
                    }
                });
            }
            return neighbours.stream()
                .filter(s -> !visited.contains(s))
                .collect(toSet());
        }
    ));
Several ways to attack this, but first you're going to need to define the problem a little more precisely. First, the graph you have here is acyclic and directed: will this always be true?
Next, you need to define what you mean by a "redundant edge". In this case, you start with a graph which has two paths a->c: one via b and one direct. From this I infer that by "redundant" you mean something like the following. Let G = <V, E> be a graph, with V the set of vertices and E ⊆ V×V the set of edges. It looks like you're treating every path from vi to vj that is shorter than the longest such path as "redundant". So the easiest thing would be to use depth-first search, enumerate the paths, and whenever you find a new one that's longer, save it as the best candidate.
I can't imagine what you want it for, though. Can you tell?
I think the easiest way to approach this is to imagine how it would look in the real world. Imagine you have joints, like
(A->B)(B->C)(A->C), and that the distance between adjacent nodes equals 1, so
(A->B) = 1, (B->C) = 1, (A->C) = 2 (going via B).
So you can remove the direct joint (A->C).
In other words, minimize.
This is just my idea how I would think about it at start. There are various articles and sources on the net, you can look at them and go deeper.
Resources that will help you:
Algorithm for Removing Redundant Edges in the Dual Graph of a Non-Binary CSP
Graph Data Structure and Basic Graph Algorithms
Google Books, On finding minimal two connected Subgraphs
Graph Reduction
Redundant trees for preplanned recovery in arbitrary vertex-redundant or edge-redundant graphs
I had a similar problem and ended up solving it this way:
My data structure is a dependents dictionary, mapping a node id to the list of nodes that depend on it (i.e. its followers in the DAG). Note that it works only for a DAG, that is, a directed acyclic graph.
I haven't calculated the exact complexity of it, but it swallowed my graph of several thousands in a split second.
_transitive_closure_cache = {}

def transitive_closure(node_id):
    """Return the set of all node ids reachable from the given node id."""
    if node_id in _transitive_closure_cache:
        return _transitive_closure_cache[node_id]
    c = set(d.id for d in dependents[node_id])
    for d in dependents[node_id]:
        c.update(transitive_closure(d.id))  # update() unions the result into c
    _transitive_closure_cache[node_id] = c
    return c

def can_reduce(source_id, dest_id):
    """Return True if the edge (source_id, dest_id) is redundant (dest_id can be reached from source_id without it)."""
    for d in dependents[source_id]:
        if d.id == dest_id:
            continue
        if dest_id in transitive_closure(d.id):
            return True  # dest is reachable by a less direct path, so this link is redundant
    return False

# Reduce redundant edges:
for node in nodes:
    dependents[node.id] = [d for d in dependents[node.id] if not can_reduce(node.id, d.id)]