In a given graph G=(V,E), each edge has a cost c(e). We have a starting node s and a target node t. How can we find the most expensive path from s to t using the following DFS algorithm?
DFS(G, s):
    foreach v in V do
        color[v] <- white; parent[v] <- nil
    DFS-Visit(s)

DFS-Visit(u):
    color[u] <- grey
    foreach v in Adj[u] do
        if color[v] = white then
            parent[v] <- u; DFS-Visit(v)
    color[u] <- black
What I have tried:
So first we create an array to maintain the cost of reaching each node:
DFS(G, s, t):
    foreach v in V do
        color[v] <- white; parent[v] <- nil; cost[v] <- -inf
    DFS-Visit(s, t)
    return cost[t]
Second, we should still visit a node even if it is grey, in order to update its cost:
DFS-Visit(u, t):
    color[u] <- grey
    foreach v in Adj[u] do
        if color[v] != black then
            parent[v] <- u
            if cost[u] < cost[v] + c(u,v) then
                cost[v] <- cost[v] + c(u,v)
            if t != v then
                DFS-Visit(v, t)
    color[u] <- black
and we don't want to go past t. What do you think? Is my approach correct?
Unfortunately, this problem is NP-complete. The proof is by a simple reduction from the Longest Path Problem (https://en.wikipedia.org/wiki/Longest_path_problem) to this one.
Proof:
Suppose we had an algorithm that could solve your problem in polynomial time, that is, find the longest path between two given nodes s and t. We could then apply this algorithm to each pair of nodes (O(n^2) times) and obtain a solution to the Longest Path Problem in polynomial time.
If a simple but highly inefficient algorithm suffices, then you can adapt the DFS algorithm so that at each node you conduct a DFS of its adjacent nodes in all permuted orders. Keep track of the maximum cost obtained over all orders.
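If exponential running time is acceptable, a simpler exhaustive alternative to the permuted-order DFS is plain backtracking over all simple paths from s to t, keeping the best total cost. Below is a minimal Python sketch of that brute force; the dict-of-adjacency-lists input format and the name longest_path_cost are assumptions of this example, not part of the question's pseudocode.

import math

def longest_path_cost(adj, s, t):
    """Exhaustive backtracking: maximum total cost of a simple path from s
    to t, or -inf if t is unreachable. adj maps each node to a list of
    (neighbor, cost) pairs. Worst-case exponential time."""
    best = -math.inf
    visited = {s}

    def backtrack(u, cost_so_far):
        nonlocal best
        if u == t:
            best = max(best, cost_so_far)
            return
        for v, c in adj.get(u, []):
            if v not in visited:      # keep the path simple (no repeated nodes)
                visited.add(v)
                backtrack(v, cost_so_far + c)
                visited.remove(v)

    backtrack(s, 0)
    return best

# Example: the most expensive simple path is s -> a -> b -> t with cost 1 + 2 + 3 = 6.
adj = {"s": [("a", 1), ("t", 1)], "a": [("b", 2)], "b": [("t", 3)]}
print(longest_path_cost(adj, "s", "t"))  # 6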
Related
I wonder what the complexity of this algorithm of mine will be, and why. It is used to check whether a graph (given as an adjacency list) is bipartite or not, using DFS.
The algorithm works as follows:
We will use edge classification and look for back edges.
If we find one, it means there is a cycle in the graph.
We will then check whether the cycle is an odd cycle or not, using the π attribute added to each vertex, counting the number of edges participating in the cycle.
If the cycle is an odd one, return false. Otherwise, continue the process.
Initially I thought the complexity would be O(|V| + |E|), where |V| is the number of vertices in the graph and |E| is the number of edges, but I am afraid it might be O(|V| + |E|^2). I wonder which option is correct and why (it may be neither of the above). Amortized or expected running times may also differ, and I wonder how I can check those as well.
Pseudocode:

DFS(G = (V, E))
    // π[u]: parent of u in the DFS tree
    for each vertex u ∈ V {
        color[u] ← WHITE
        π[u] ← NULL }
    time ← 0
    for each vertex u ∈ V {
        if color[u] = WHITE
            DFS-VISIT(u) }
and for the DFS-Visit:
DFS-Visit(u)
    // white vertex u has just been discovered
    color[u] ← GRAY
    time ← time + 1
    d[u] ← time
    for each v ∈ Adj[u] {              // going over all edges {u, v}
        if color[v] = WHITE {
            π[v] ← u
            DFS-VISIT(v) }
        else if color[v] = GRAY        // there is a cycle in the graph
            CheckIfOddCycle(u, v) }
    color[u] ← BLACK
    // change the color of vertex u to black as we finished going over it
    f[u] ← time ← time + 1
and as for deciding what type of cycle it is:
CheckIfOddCycle(u, v)
    int count ← 1
    vertex p ← u
    while (p != v) {
        p ← π[p]
        count++ }
    if count is an odd number {
        S.O.P("The graph is not bipartite!")
        stop the search, as the result is now concluded }
Thanks!
To determine whether or not a graph is bipartite, do a DFS or BFS that covers all the edges in the entire graph, and:
When you start on a new vertex that is disconnected from all previous vertices, color it blue;
When you discover a new vertex connected to a blue vertex, color it red;
When you discover a new vertex connected to a red vertex, color it blue;
When you find an edge to a previously discovered vertex, return FALSE if it connects blue to blue or red to red.
If you make it through the entire graph, return TRUE.
This algorithm takes very little work on top of the BFS or DFS, and is therefore O(|V|+|E|).
This algorithm is also essentially the same as the algorithm in your question. When we discover a back-edge with the same color on both sides, it means that the cycle(s) we just discovered are of odd length.
But really this algorithm has nothing to do with cycles. A graph can have a lot more cycles than it has vertices or edges, and a DFS or BFS will not necessarily find them all, so it wouldn't be accurate to say that we are searching for odd cycles.
Instead we are just trying to make a bipartite partition and returning whether or not it's possible to do so.
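For reference, here is a minimal Python sketch of this coloring idea using BFS; the adjacency-list input format and the name is_bipartite are assumptions of this example.

from collections import deque

def is_bipartite(adj):
    """2-color the graph with BFS. adj[u] is the list of neighbours of u.
    Runs in O(|V| + |E|)."""
    n = len(adj)
    color = [None] * n                   # None = undiscovered, 0 = blue, 1 = red
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0                 # new component: color its first vertex blue
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if color[v] is None:
                    color[v] = 1 - color[u]    # give it the opposite color
                    queue.append(v)
                elif color[v] == color[u]:     # blue-blue or red-red edge
                    return False
    return True

# A 4-cycle is bipartite, a triangle is not.
print(is_bipartite([[1, 3], [0, 2], [1, 3], [0, 2]]))  # True
print(is_bipartite([[1, 2], [0, 2], [0, 1]]))          # False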
Let G=(V,E) be an undirected graph. How can we count cycles of length 3 exactly once using the following DFS:
DFS(G, s):
    foreach v in V do
        color[v] <- white; p[v] <- nil
    DFS-Visit(s)

DFS-Visit(u):
    color[u] <- grey
    foreach v in Adj[u] do
        if color[v] = white then
            p[v] <- u; DFS-Visit(v)
    color[u] <- black
There is a cycle whenever we discover a node that has already been discovered (grey). The edge to that node is called a back edge. The cycle has length 3 when p[p[p[v]]] = v, right? So
DFS-Visit(u):
    color[u] <- grey
    foreach v in Adj[u] do
        if color[v] = grey and p[p[p[v]]] = v then
            // we got a cycle of length 3
        else if color[v] = white then
            p[v] <- u; DFS-Visit(v)
    color[u] <- black
However, how can I create a proper counter for the number of cycles, and how can I count each cycle only once?
I'm not sure I understand how your condition parent[parent[parent[v]]] == v works. IMO it should never be true as long as parent represents a tree structure (because it corresponds to the spanning tree associated with the DFS).
Directed graphs
Back edges, cross edges and forward edges can all "discover" new cycles.
We separate the following possibilities (let's say you reach a u -> v edge):
Back edge: u and v belong to the same 3-cycle iff parent[parent[u]] = v.
Cross edge: u and v belong to the same 3-cycle iff parent[u] = parent[v].
Forward edge: u and v belong to the same 3-cycle iff parent[parent[v]] = u.
Undirected graphs
There are no cross edges any more, and back edges and forward edges are redundant. Therefore you only have to check back edges: when you reach a u -> v back edge, u and v belong to the same 3-cycle iff parent[parent[u]] = v.
def dfs(u):
    color[u] = GREY
    for v in adj[u]:
        # Back edge
        if color[v] == GREY:
            if parent[parent[u]] == v:
                print("({}, {}, {})".format(v + 1, parent[u] + 1, u + 1))
        # v unseen
        elif color[v] == WHITE:
            parent[v] = u
            dfs(v)
    color[u] = BLACK
If you want to test it:
WHITE, GREY, BLACK = 0, 1, 2

nb_nodes, nb_edges = map(int, input().split())
adj = [[] for _ in range(nb_nodes)]
for _ in range(nb_edges):
    u, v = map(int, input().split())
    adj[u - 1].append(v - 1)
    adj[v - 1].append(u - 1)
parent = [None] * nb_nodes
color = [WHITE] * nb_nodes
for u in range(nb_nodes):        # run the DFS from every still-white vertex
    if color[u] == WHITE:
        dfs(u)
If a solution without using DFS is okay, there is an easy solution which runs in O(NM log(N³)), where N is the number of vertices in the graph and M is the number of edges.
We are going to iterate over edges instead of vertices. For every edge u-v, we have to find every vertex that is connected to both u and v. We can do this by iterating over every vertex w in the graph and checking whether both edges v-w and w-u exist. Whenever we find such a vertex, we sort u, v, w and add the ordered triplet to a BBST that does not allow repetitions (e.g. std::set in C++). The count of length-3 cycles will be exactly the size of the BBST (the number of elements added) after we have checked every edge in the graph.
Let's analyze the complexity of the algorithm:
We iterate over every edge. Current complexity is O(M).
For each edge, we iterate over every vertex. Current complexity is O(NM).
For each (edge, vertex) pair that forms a cycle, we add a triplet to a BBST. Adding to a BBST has O(log(K)) complexity, where K is the size of the BBST. In the worst case every triplet of vertices forms a cycle, so we may add up to O(N³) elements, and the cost of a single insertion can reach O(log(N³)). The final complexity is therefore O(NM log(N³)). This may sound like a lot, but in the worst case M = O(N²), so the complexity is O(N³ log(N³)). Since there may be up to O(N³) cycles of length 3, this algorithm is only a log factor away from an optimal one.
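Here is a short Python sketch of that edge-iteration idea; a Python set stands in for the BBST (so insertions are expected O(1) rather than O(log K)), and the vertex/edge input format and the name count_triangles are assumptions of this example.

def count_triangles(n, edges):
    """Count 3-cycles of an undirected graph by iterating over its edges and,
    for each edge (u, v), scanning every vertex w adjacent to both u and v."""
    neighbors = [set() for _ in range(n)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    triangles = set()               # each triangle is stored once, as a sorted tuple
    for u, v in edges:
        for w in range(n):
            if w in neighbors[u] and w in neighbors[v]:
                triangles.add(tuple(sorted((u, v, w))))
    return len(triangles)

# Example: the complete graph K4 contains 4 triangles.
print(count_triangles(4, [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]))  # 4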
Below is the general code for DFS with logic for marking back edges and tree edges. My understanding is that back edges from a vertex point back to an ancestor, and edges that point to the direct parent are not back edges (let's assume an undirected graph).
In an undirected graph we effectively have the edge in both directions between two vertices x and y. So after visiting x, when I process y, y has x as an adjacent vertex, but since x has already been visited, the code will mark the edge as a back edge.
Am I right in saying that? Should we add extra logic to avoid this, in case my assumption is valid?
DFS(G)
    for v in vertices[G] do
        color[v] = white
        parent[v] = nil
    time = 0
    for v in vertices[G] do
        if color[v] = white then
            DFS-Visit(v)

Induces a depth-first tree on the graph, starting at v.

DFS-Visit(v)
    color[v] = gray
    time = time + 1
    discovery[v] = time
    for a in Adj[v] do
        if color[a] = white then
            parent[a] = v
            DFS-Visit(a)
            v->a is a tree edge
        elseif color[a] = grey then
            v->a is a back edge
    color[v] = black
    time = time + 1
white means unexplored, gray means frontier, black means processed
Yes. This implementation distinguishes nodes only by color (visited vs. not visited) and thus doesn't separate the parent from other ancestor nodes. So every tree edge of the DFS tree will also be reported as a back edge (when the child scans its parent).
In order to separate tree edges from back edges, you need to distinguish the edge to the parent from edges to other ancestors. A simple way is to pass the parent node as a parameter p to DFS-Visit. For example:
DFS-Visit(v, p)
    color[v] = gray
    time = time + 1
    discovery[v] = time
    for a in Adj[v] do
        if color[a] = white then
            parent[a] = v
            DFS-Visit(a, v)
            v->a is a tree edge
        elseif color[a] = grey and (a is not p) then
            v->a is a back edge
    color[v] = black
    time = time + 1
UPDATE: I hadn't noticed that you already store parent nodes, so there is no need to introduce a parameter:
DFS-Visit(v)
    color[v] = gray
    time = time + 1
    discovery[v] = time
    for a in Adj[v] do
        if color[a] = white then
            parent[a] = v
            DFS-Visit(a)
            v->a is a tree edge
        elseif color[a] = grey and (a is not parent[v]) then
            v->a is a back edge
    color[v] = black
    time = time + 1
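For reference, here is the same parent-skipping idea (the second variant, using the stored parent array) as runnable Python; the adjacency-list input and the name classify_edges are assumptions of this sketch.

def classify_edges(adj):
    """DFS over an undirected graph given as adjacency lists.
    Returns (tree_edges, back_edges). The edge leading back to the direct
    parent is skipped, so it is not misreported as a back edge."""
    n = len(adj)
    WHITE, GREY, BLACK = 0, 1, 2
    color = [WHITE] * n
    parent = [None] * n
    tree_edges, back_edges = [], []

    def visit(v):
        color[v] = GREY
        for a in adj[v]:
            if color[a] == WHITE:
                parent[a] = v
                tree_edges.append((v, a))
                visit(a)
            elif color[a] == GREY and a != parent[v]:   # grey, but not my parent
                back_edges.append((v, a))
        color[v] = BLACK

    for v in range(n):
        if color[v] == WHITE:
            visit(v)
    return tree_edges, back_edges

# A triangle 0-1-2 has two tree edges and exactly one back edge.
print(classify_edges([[1, 2], [0, 2], [0, 1]]))  # ([(0, 1), (1, 2)], [(2, 0)])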
(This is derived from a recently completed programming contest)
You are given G, a connected graph with N nodes and N-1 edges.
(Notice that this implies G forms a tree.)
Each edge of G is directed (not necessarily towards any root).
For each vertex v of G it is possible to invert zero or more edges such that there is a directed path from every other vertex w to v. Let the minimum possible number of edge inversions needed to achieve this be f(v).
By what linear or log-linear algorithm can we determine the subset of vertices that have the minimal overall f(v) (including the value of f(v) for those vertices)?
For example consider the 4 vertex graph with these edges:
A<--B
C<--B
D<--B
The values are f(A) = 2, f(B) = 3, f(C) = 2 and f(D) = 2,
so the desired output is {A, C, D} and 2.
(Note we only need to calculate f(v) for the vertices that have minimal f(v), not for all of them.)
Code:
For posterity, here is the code of the solution:
#include <cstdio>
#include <iostream>
#include <vector>
#include <climits>
#include <algorithm>
using namespace std;

int main()
{
    // fwd is true when this adjacency entry is stored in the same
    // orientation as the (src, dest) pair read from the input.
    struct Edge
    {
        bool fwd;
        int dest;
    };
    int n;
    cin >> n;
    vector<vector<Edge>> V(n + 1);
    for (int i = 0; i < n - 1; i++)
    {
        int src, dest;
        scanf("%d %d", &src, &dest);
        V[src].push_back(Edge{true, dest});
        V[dest].push_back(Edge{false, src});
    }
    vector<int> F(n + 1, -1);
    vector<bool> done(n + 1, false);
    vector<int> todo;
    // First pass: iterative DFS from vertex 1 to compute F[1].
    todo.push_back(1);
    done[1] = true;
    F[1] = 0;
    while (!todo.empty())
    {
        int next = todo.back();
        todo.pop_back();
        for (Edge e : V[next])
        {
            if (done[e.dest])
                continue;
            if (!e.fwd)
                F[1]++;
            done[e.dest] = true;
            todo.push_back(e.dest);
        }
    }
    // Second pass: propagate F to the rest of the tree; each edge changes
    // the count by +1 or -1 depending on its direction.
    todo.push_back(1);
    while (!todo.empty())
    {
        int next = todo.back();
        todo.pop_back();
        for (Edge e : V[next])
        {
            if (F[e.dest] != -1)
                continue;
            if (e.fwd)
                F[e.dest] = F[next] + 1;
            else
                F[e.dest] = F[next] - 1;
            todo.push_back(e.dest);
        }
    }
    // Find the minimal F value over all vertices 1..n, then print it
    // together with every vertex that attains it.
    int minf = INT_MAX;
    for (int i = 1; i <= n; i++)
        minf = min(minf, F[i]);
    cout << minf << endl;
    for (int i = 1; i <= n; i++)
        if (F[i] == minf)
            cout << i << " ";
    cout << endl;
}
I think that the following algorithm works correctly, and it certainly works in linear time.
The motivation for this algorithm is the following. Let's suppose that you already know the value of f(v) for some single node v. Now, consider any node u adjacent to v. If we want to compute the value of f(u), we can reuse some of the information from f(v) in order to compute it. Note that in order to get from any node w in the graph to u, one of two cases must happen:
That path passes through the edge connecting u and v. In that case, the way that we get from w to u is to go from w to v, then to follow the edge from v to u.
That path does not pass through the edge connecting u and v. In that case, the way that we get from w to u is the exact same way that we got from w to v, except that we stop as soon as we get to u.
The reason that this observation is important is that it means that if we know the number of edges we'd flip to get from any node to v, we can easily modify it to get the set of edges that we'd flip to get from any node to u. Specifically, it's going to be the same set of edges as before, except that we want to direct the edge connecting u and v so that it connects v to u rather than the other way around.
If the edge between u and v is initially directed (u, v), then we have to flip all the edges we flipped to get every node pointing at v, plus this one extra edge so that it points from v to u. Thus f(u) = f(v) + 1. Otherwise, if the edge is originally directed (v, u), the set of edges we'd flip is the same as before (pointing everything at v), except that we no longer flip the edge (v, u). Thus f(u) = f(v) - 1.
Consequently, once we know the value of f for a single node v, we can compute it for each adjacent node u as follows:
f(u) = f(v) + 1 if (u, v) is an edge.
f(u) = f(v) - 1 otherwise
This means that we can compute f(v) for all nodes v as follows:
Compute f(v) for some initial node v, chosen arbitrarily.
Do a DFS starting from v. When reaching a node u, compute its f score using the above logic.
All that's left to do is to compute f(v) for some initial node. To do this, we can run a DFS from v outward. Every time we see an edge pointed the wrong way, we have to flip it. Thus the initial value of f(v) is given by the number of wrong-pointing edges we find during the initial DFS.
We thus can compute the f score for each node in O(n) time by doing an initial DFS to compute f(v) for the initial node, then a secondary DFS to compute f(u) for each other node u. You can then for-loop over each of the n f-scores to find the minimum score, then do one more loop to find all values with that f-score. Each of these steps takes O(n) time, so the overall algorithm takes O(n) time as well.
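To make the two passes concrete, here is a small recursive Python sketch of the approach described above (this is not the contest code shown earlier; the 0-indexed edge-list input and the name min_inversions are assumptions of this example, and the plain recursion would need to be made iterative for very deep trees).

def min_inversions(n, edges):
    """edges contains pairs (u, v) meaning a directed edge u -> v (0-indexed);
    the underlying undirected graph is a tree. Returns the minimal f value
    and the list of vertices achieving it."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append((v, True))     # traversing u -> v follows the edge
        adj[v].append((u, False))    # traversing v -> u goes against it
    f = [0] * n

    # Pass 1: f(root) = number of edges directed away from the root,
    # i.e. the edges that must be flipped so every vertex can reach it.
    def count_flips(u, parent):
        total = 0
        for v, follows in adj[u]:
            if v != parent:
                total += (1 if follows else 0) + count_flips(v, u)
        return total

    # Pass 2: moving the target from a node to its child changes f by -1 if the
    # edge already points parent -> child (we no longer flip it), and by +1 if
    # it points child -> parent (it must now be flipped).
    def propagate(u, parent):
        for v, follows in adj[u]:
            if v != parent:
                f[v] = f[u] + (-1 if follows else 1)
                propagate(v, u)

    f[0] = count_flips(0, -1)
    propagate(0, -1)
    best = min(f)
    return best, [v for v in range(n) if f[v] == best]

# The example from the question, with A, B, C, D = 0, 1, 2, 3 and edges B->A, B->C, B->D:
print(min_inversions(4, [(1, 0), (1, 2), (1, 3)]))  # (2, [0, 2, 3])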
Hope this helps! This was an awesome problem!
I have the following BFS function from Cormen.
The shortest-path distance path(s,v) from s to v is defined as the minimum number of edges in any path from vertex s to vertex v, or else ∞ if there is no path from s to v. A path of length path(s,v) from s to v is said to be a shortest path from s to v.
The following lemma is given:
Let G = (V,E) be a directed or undirected graph, and let s ∈ V be an arbitrary vertex. Then, for any edge (u, v) ∈ E,
path(s,v) <= path(s,u) + 1.
My question is why we need <= in the above formula. I thought "=" would be fine. Can anyone give me one scenario where <= is required?
Below is the BFS algorithm.
Lemma 2:
Let G = (V,E) be a directed or undirected graph, and suppose that BFS is run on G from a given source vertex s ∈ V. Then upon termination, for each vertex v ∈ V, the value d[v] computed by BFS satisfies d[v] >= path(s, v).
Proof:
We use induction on the number of times a vertex is placed in the queue Q. Our inductive hypothesis is that d[v] >= path(s,v) for all v ∈ V.
The basis of the induction is the situation immediately after s is placed in Q in line 8 of BFS.
The inductive hypothesis holds here, because d[s] = 0 = path(s, s) and d[v] = ∞ >= path(s, v) for all v ∈ V - {s}.
My question is: what does the author mean by "We use induction on the number of times a vertex is placed in the queue Q", and how is it related to the inductive hypothesis?
Thanks!
BFS(G,s)
 1  for each vertex u ∈ V[G] - {s}
 2      do color[u] ← WHITE
 3         d[u] ← ∞
 4         π[u] ← NIL
 5  color[s] ← GRAY
 6  d[s] ← 0
 7  π[s] ← NIL
 8  Q ← {s}
 9  while Q ≠ ∅
10      do u ← head[Q]
11         for each v ∈ Adj[u]
12             do if color[v] = WHITE
13                   then color[v] ← GRAY
14                        d[v] ← d[u] + 1
15                        π[v] ← u
16                        ENQUEUE(Q, v)
17         DEQUEUE(Q)
18         color[u] ← BLACK
For your first question, consider a complete graph with only three vertices. In this graph, is it true that path(s,v) = path(s,u) + 1 for every edge (u, v)? Every pair of vertices is adjacent, so for an edge (u, v) with u, v distinct from s we have path(s,v) = 1 < path(s,u) + 1 = 2, which is exactly why the lemma needs <=.