mark x as visited
list L = x
tree T = x
while L nonempty
    choose some vertex v from front of list
    process v
    for each unmarked neighbor w
        mark w as visited
        add it to end of list
        add edge vw to T
Most implementations choose to mark the adjacent node as visited before actually visiting it. Wouldn't it technically also be correct to add all the neighbors first and mark them as visited only when they are visited later?
list L = x
tree T = x
while L nonempty
    choose some vertex v from front of list
    if (v NOT YET VISITED)
        MARK v AS VISITED HERE
        for each unmarked neighbor w
            add it to end of list
            add edge vw to T
Why is it that every BFS implementation seems to mark nodes as visited before you have even visited them? I am trying to find a theoretically correct version of BFS. Which one is correct?
Both algorithms work, but the second version might add the same node to the list L twice. This doesn't affect correctness because of the additional check whether a node was visited, but it increases memory consumption and requires an extra check. That's why you'll typically see the first algorithm in text books.
Both are correct, but they use different definitions of the word visited. It is common for algorithms to have many variations and have many different implementations that are all correct, and BFS is one example.
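For reference, here is a minimal Python sketch of both variants, assuming the graph is a dict mapping each vertex to a list of neighbors (the function and variable names are mine):

from collections import deque

def bfs_mark_on_enqueue(graph, x):
    # variant 1: mark a vertex as soon as it is put on the queue
    visited = {x}
    queue = deque([x])
    tree = []                              # edges of the BFS tree
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in visited:
                visited.add(w)
                queue.append(w)
                tree.append((v, w))
    return tree

def bfs_mark_on_dequeue(graph, x):
    # variant 2: mark a vertex only when it comes off the queue;
    # the same vertex may sit in the queue more than once, so the
    # tree edge is recorded only on the first (marking) dequeue
    visited = set()
    queue = deque([(x, None)])             # (vertex, parent)
    tree = []
    while queue:
        v, parent = queue.popleft()
        if v in visited:
            continue
        visited.add(v)
        if parent is not None:
            tree.append((parent, v))
        for w in graph[v]:
            if w not in visited:
                queue.append((w, v))
    return tree

Both produce a valid BFS tree; the second needs the extra check on dequeue precisely because a vertex can be enqueued twice, which is the memory overhead mentioned above.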
Dijkstra's algorithm in CLRS, p.595 has the following code in line 7:
for each vertex v \in Adj[u]
This line iterates over all neighbors v of node u. Node u here is the one the algorithm is currently processing and adding to the shortest-path tree.
However, among those neighbors of u, the ones already in set S have been processed and are done with, and those
nodes in S form a shortest-path tree T.
None of the nodes in set S can have a path through u that is shorter than a path already in T.
Otherwise, that path would have been traversed by then.
So, shouldn't this line 7 be better as
for each vertex v \in Adj[u] \intersect Q // Q = V \ S
or, equally,
for each vertex v \in Adj[u]\S
?
//===========================
ADDING explanations:
Once you have processed a node u (processed = set the distance and parent entries
of all its immediate neighbors) and added it to the tree,
that node u is at the shortest distance from the source. If there were an off-tree node
z such that a shorter path between the source and u existed through it, that node z would have been processed before u.
//======================
ADDITION 2: lengthy comment to Javier's useful answer below:
Put all edges of the graph in an array, say EDGES -- one edge, one array entry.
Each array entry holds the edge (u,v), the edge weight, and 2 pointers -- one to node u and one to node v.
The graph is still represented as an adjacency list.
Adj[u] is still a linked list -- however, this linked list lives on an array structure.
The node values in this list, this time, are the indices of EDGES corresponding to the edges.
So, for instance, node u has 2 links incident to it:
(u,x) & (u,y). Edge (u,x) is sitting at the 23rd cell of EDGES and (u,y) at the 5th.
Then, Adj[u] is a linked list of length 2, and the nodes in this list are 23 and 5. Say, Adj[u][0].edgesIndex = 23 and Adj[u][1].edgesIndex = 5. Here, Adj[x][i].edgesIndex = 23 for some i in the linked list at Adj[x] as well. (Adj[j][i], being a node in a linked list, further has the "next" and "prev" fields on it.)
And, EDGES[23] has one reference to the corresponding entry of (u,x) on Adj[u], and another to that of Adj[x]. I leave line 7 as is, but this time, after I process an edge (u,x) in that loop (I've found out about this edge from Adj[u]), I remove that edge from the linked list of Adj[u]; from there I go to the corresponding EDGES entry, which has the reference to the corresponding Adj[x][i] entry. I remove them all -- EDGES[23], Adj[u][0] and Adj[x][i] (whatever i is there). With everything being array structures, I can process all of this in constant time per edge.
Still with the adjacency-list representation, I can trace the location of (v,u) from (u,v) and remove both in constant time, and now I process only the edges in the intersection I'm looking for, in asymptotically the same amount of memory and with more time efficiency.
//====================
ADDITION 3:
Correcting one thing in ADDITION 2 above:
What I wrote in that addition may take more -- not less -- time than the algorithm without it:
removing the links in the linked lists at Adj[u] and Adj[x] and the corresponding
EDGES entry, with all the direct-memory look-ups involved, is not likely
to take fewer CPU cycles than simply relaxing the edges in the algorithm as is.
It still checks every edge (u,v) exactly once and not twice --
once for (u,v) and once for (v,u) -- and clearly in the same asymptotic time as the algorithm without it. But that is little gain in absolute
processing time, at more cost in memory used.
Another alternative is:
adding a line like
if (v \in S) then continue;
as the first line of the for loop. This can be implemented by maintaining S as
a boolean array of size |V| and setting its entries accordingly as each vertex is
added to set S -- which is basically what Javier is saying in his answer.
Intersecting Adj[u] with Q is correct; however, it's not a good idea, because in the end you'll still need to iterate over all elements of Adj[u]. I don't think there's a way to work around that.
It would only pay off if you could intersect those two sets VERY efficiently, i.e., in anything better than O(n).
A nice enhancement you can implement is to mark all the nodes that are settled; then, if the node v is settled, you can skip the rest of that iteration of the inner loop.
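A minimal Python sketch of that enhancement, assuming the graph is a dict mapping each node to a list of (neighbor, weight) pairs (the names are mine); the settled set plays the role of S, and settled neighbors are skipped inside the inner loop:

import heapq

def dijkstra(graph, source):
    dist = {source: 0}
    parent = {source: None}
    settled = set()                          # the set S
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in settled:
            continue
        settled.add(u)
        for v, w in graph[u]:
            if v in settled:                 # "if v in S then continue"
                continue
            nd = d + w
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                parent[v] = u
                heapq.heappush(pq, (nd, v))
    return dist, parent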
I have a graph with positive edge weights and positive node weights. The length of a path is defined as the sum of all the edge weights along the path, plus the maximum node weight encountered along the path.
I'd initially thought that a modified Dijkstra would work, but I found a test case where it would fail. How should I go about solving this problem? Are there any standard algorithms I should look at?
My modified Dijkstra is as follows: At each node I record the shortest path so far, and also the maximum node weight I've seen so far, and use that to calculate the length to neighboring nodes. Please see my comment for the details.
Here's a graph where Dijkstra fails:
http://i.imgur.com/FQhRzXV.jpg
The numbers in green are the node labels. Everything in blue is a weight (node and edge weights). Let's say I want to compute the shortest path between nodes 1 and 7 (labeled in green). The problem with Dijkstra is that node 4 always records the path 1-8-9-4, since it is shorter than the path 1-2-3-4 (the former has length 9 vs. 13 for the latter). But to reach node 7, the path 1-8-9-4-5-6-7 is longer than 1-2-3-4-5-6-7.
If you can forgive a running time that is one polynomial order larger, there is a fairly easy algorithm:
ModifiedShortestPath(u, v, G) {
    X = StandardShortestPath(u, v, G);
    E = heaviest edge in X
    F = all edges in G of weight >= E
    Y = ModifiedShortestPath(u, v, G - F); // recur here on G without the F edges
    return Min(X, Y);
}
The runtime of this is |E| times more than your standard shortest path.
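A direct Python transcription of that sketch, using networkx just for illustration and assuming edge weights are stored in a 'weight' edge attribute and node weights in a 'weight' node attribute; it follows the recursion literally rather than claiming anything beyond it:

import networkx as nx

def modified_cost(G, path):
    # sum of edge weights along the path plus the heaviest node weight on it
    edge_sum = sum(G[a][b]['weight'] for a, b in zip(path, path[1:]))
    return edge_sum + max(G.nodes[n]['weight'] for n in path)

def modified_shortest_path(G, u, v):
    try:
        X = nx.shortest_path(G, u, v, weight='weight')              # StandardShortestPath
    except nx.NetworkXNoPath:
        return None
    if len(X) < 2:
        return X
    heaviest = max(G[a][b]['weight'] for a, b in zip(X, X[1:]))     # E
    F = [(a, b) for a, b, w in G.edges(data='weight') if w >= heaviest]
    H = G.copy()
    H.remove_edges_from(F)                                          # G - F
    Y = modified_shortest_path(H, u, v)
    if Y is None:
        return X
    return min((X, Y), key=lambda p: modified_cost(G, p))           # Min(X, Y)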
Your graph is not that clear to begin with (too many values in blue with an unclear role), which makes answering even more difficult. A much better question, a simpler graph, and some straight answers can be found in this post.
What made it clear for me, and allowed me to correct my implementation and get the correct results, was that at the end of each repetition in the loop, when it was time to pick the next node/vertex, whose unvisited neighbours I should examine, I had to pick from the whole pool of unvisited vertices, not just from the unvisited neighbours of the currently examined node. I was under the false impression that once you pick a path at a crossroad, because the greedy nature of the algorithm takes you there, you can only follow it to the end, unvisited after unvisited node. No. You pick the next globally unvisited node each time based on the smallest tentative value, regardless of its position in the graph or whether it is connected to the current node.
I hope that clears up the confusion that others like me have experienced and that has led them here.
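A small Python sketch of that selection step, in the simple array-based O(V^2) form and assuming the graph is a dict of {node: [(neighbor, weight), ...]} (names are mine); the key line is the min() taken over the whole pool of unvisited vertices:

import math

def dijkstra_simple(graph, source):
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    unvisited = set(graph)
    while unvisited:
        # pick the next vertex from ALL unvisited vertices by smallest
        # tentative distance, not just from the current node's neighbors
        u = min(unvisited, key=lambda v: dist[v])
        if dist[u] == math.inf:
            break                            # the rest is unreachable
        unvisited.remove(u)
        for v, w in graph[u]:
            if v in unvisited and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    return dist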
I need some help in writing a set of if-then rules for traversing a maze. This is the problem:
"Assume that the maze is constructed on a grid of square cells by placing walls across some of the edges of cells in such a way that there is a path from any cell within the maze to an outer edge of the maze that has no wall.
One way is the left-hand rule, but this strategy can take you around in cycles.
Write if-then rules in English for traversing the maze and detecting a cycle. Assume you know the size of the grid and the maximum distance you may have to travel to escape the maze."
This is what I have so far:
Start
If only one path (left, right, or straight) is found, follow the path.
Else if multiple paths are found:
    If a left path is found, take a left turn.
    Else if a straight path is found, follow the straight path.
    Else if a right path is found, take a right turn.
Else if a dead end is found, take a 'U' turn.
Go to step 2
End
But this is not solving the cycle problem. Can anyone help please?
There are two generic algorithms for exploring graphs: Breadth First Search (BFS) and Depth First Search (DFS). The trick to these algorithms is that they start out with every node in the unexplored list, and as they visit nodes they move them to the explored list. As you visit each node you remove it from the unexplored list, so you won't revisit it. By only pulling nodes from the unexplored list, you never end up doubling back on yourself.
Here are examples of DFS with checks to prevent cycles and BFS:
function DFS(G,v):
    label v as explored
    for all edges e in G.adjacentEdges(v) do
        if edge e is unexplored then
            w ← G.adjacentVertex(v,e)
            if vertex w is unexplored then
                label e as a discovery edge
                recursively call DFS(G,w)
            else
                label e as a back edge
Now BFS:
procedure BFS(G,v):
    create a queue Q
    enqueue v onto Q
    mark v
    while Q is not empty:
        t ← Q.dequeue()
        if t is what we are looking for:
            return t
        for all edges e in G.adjacentEdges(t) do
            u ← G.adjacentVertex(t,e)
            if u is not marked:
                mark u
                enqueue u onto Q
    return none
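To tie this back to the maze question, here is a small Python sketch of the BFS idea on a grid maze; the representation is my own assumption: walls is a set of frozenset pairs of adjacent positions that have a wall between them, and stepping onto any position outside the grid counts as escaping:

from collections import deque

def escape_maze(rows, cols, walls, start):
    visited = {start}                          # the "mark" that prevents cycles
    queue = deque([(start, [start])])
    while queue:
        (r, c), path = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if frozenset({(r, c), (nr, nc)}) in walls:
                continue                       # a wall blocks this move
            if not (0 <= nr < rows and 0 <= nc < cols):
                return path                    # stepped off the grid: escaped
            if (nr, nc) not in visited:
                visited.add((nr, nc))
                queue.append(((nr, nc), path + [(nr, nc)]))
    return None                                # no exit reachable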
Given a directed graph G, what is the best way to go about finding a vertex v such that there is a path from v to every other vertex in G?
This algorithm should run in linear time. Is there an existing algorithm that solves this? If not, I'd appreciate some insight into how this can be solved in linear time (I can only think of solutions that would certainly not take linear time).
Make a list L of all vertices.
Choose one; call it V. From V, walk the graph, removing vertices from the list as you go, and keeping a stack of unvisited edges. When you hit a vertex that is no longer on the list (i.e., you have found a loop), pop one of the edges from the stack and proceed from there.
If the stack is empty, and L is not empty, then choose a new vertex from L, call it V, and proceed as before.
When L is finally empty, the V you last chose is an answer.
This can be done in linear time in the number of edges.
Find the strongly connected components.
Condense each of the components into a single node.
Do a topological sort on the condensed graph. The node with the highest rank will have a path to each of the other nodes (if the graph is connected at all).
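A short Python sketch of that approach, assuming networkx is acceptable (nx.condensation builds the DAG of strongly connected components); it takes the first component in a topological order and verifies that a vertex inside it reaches everything:

import networkx as nx

def vertex_reaching_all(G):
    C = nx.condensation(G)                       # DAG of SCCs; 'members' holds the original vertices
    first = next(iter(nx.topological_sort(C)))   # the highest-ranked component
    v = next(iter(C.nodes[first]['members']))    # any vertex inside it
    reachable = nx.descendants(G, v) | {v}
    return v if len(reachable) == G.number_of_nodes() else None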
I think I've got a correct answer.
Get the SCCs.
Condense each of the components into a single node.
Check whether every pair of adjacent nodes is reachable.
This is a sufficient and necessary condition.
Let there be an undirected tree T, and let T.leaves be the set of all its leaves (each v such that d(v) = 1). We know |T.leaves| and the distance between u and v for each u, v in T.leaves.
In other words: we have an undirected tree, and we know how many leaves it has and the distance between every 2 leaves.
We need to find how many internal vertices (d(v) > 1) are in the tree.
Note: building the complete tree is impossible, because if we have only 2 leaves but the distance between them is 2^30, it will take too long...
I tried to start from the shortest distance and count how many vertices lie between that pair of leaves, and then to add the leaf closest to them, but for this I need some formula f(leaves_counted, next_leaf), and I could not manage to find that f...
Any ideas?
Continued from discussion in comments. This is how to check a particular (compressed) edge to see if you can attach the new vertex n somewhere in the middle of it, without iterating over the distances.
Ok, so you need to find three numbers: l (the distance of the attach point from the left node of the edge in question), x (the distance of the new node from the attach point) and r (symmetric to l).
Obviously, for every node y in set L (the left part of the tree), its distance to A must differ from its distance to n by the same number (let's call it dl, which must equal l + x). If this is not the case, there is no solution for this particular edge. The same goes for nodes in R, with dr and r + x respectively.
If the above holds, then you have three equations:
l + x = dl
r + x = dr
r+l = dist(A,B)
Three equations, three unknowns. If this system has a solution, then you have found the right edge.
At worst you need to iterate the above for every edge, but I think it can be optimized - the distance check on L and R might exclude one of the parts of the tree from further search. It might also be possible to somehow get the number of nodes without even constructing the tree.
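For illustration, a tiny sketch of solving that 3-equation system (the function name is mine; it returns None when the candidate edge admits no consistent, non-negative attach point):

def attach_point(dl, dr, dist_AB):
    # solve  l + x = dl,  r + x = dr,  l + r = dist_AB
    two_x = dl + dr - dist_AB          # equals 2 * x
    if two_x < 0 or two_x % 2 != 0:
        return None
    x = two_x // 2
    l, r = dl - x, dr - x
    if l < 0 or r < 0:
        return None
    return l, x, r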
If your binary tree has L leaves, then it has L-1 internal vertices, regardless of the shape of the tree.
You can easily prove this: start with the tree consisting of only one (root) node. Then take any leaf and add two descendants to it, converting that leaf into an internal vertex and adding two leaves. This removes one leaf (the old node) and adds one internal node and two leaves, i.e. the net change is +1 internal node and +1 leaf. Because you start with one leaf and 0 internal nodes, you always have |leaves| = |internal nodes| + 1 -- and any tree shape can be produced by this process.
Here are examples of the two shapes of trees with 4 leaves (up to trivial left-right symmetries):
        o                  o
       / \                / \
      o   L              o   o
     / \                / \ / \
    o   L              L L L L
   / \
  L   L
The number of internal vertices is always 3.