I am currently working on a problem where I want to find an algorithm that does the following: given a square grid graph G, a start node S, and an end node E, both in G, find a path P from S to E with maximum value such that |P| <= k. If it makes things easier, one can possibly turn G into a DAG.
The grid cells are either 0 or 1.
As an example:
S--o--o--o
| : | |
o--o..o..o
: | : |
o--o--E--o
| : | |
o--o--o--o
S := "Starting State"
E := "Ending State"
- := "Edge value is 1"
. := "Edge value is 0"
Solution with k = 5 (from what I see)
S o o o
|
o--o o o
|
o o--E o
o o o o
S and E can lie anywhere, so one cannot assume movement is only down and right, but I assume I can transform the graph into a DAG at some loss of optimality.
Edge value is a cost, G is a grid graph where every node is connected to its four neighbours.
First of all, is this problem already known in the literature? I did not find anything about it. Is it NP-hard, or does someone have an idea for a fast algorithm? I asked the search engine of my choice, and somebody asked something possibly related on StackOverflow, but their problem description does not quite match mine: their goal is the last row, whereas mine is a distinct node.
Aight, a warning first: I've thought this up on the spur of the moment. I seem to remember reading about something like it before, but I can't remember where, so while it seems correct I can't be sure of it. If I spot a flaw later, I'll come back and edit this post and notify you.
Let L(k, v) be the value of the highest-value path of length at most k from S to some node v, and suppose v has predecessors {u1, u2, ..., um}. Since G is a DAG, it must be that
L(k, v) = max { L(k-1, u1) + w(u1, v), L(k-1, u2) + w(u2, v), ..., L(k-1, um) + w(um, v) }
where w(u,v) is the weight of the edge from u to v.
To put this to use, we will find the highest-value path of length at most R to every node within a radius R of S. That gives us enough information to calculate the highest-value path of length at most R+1 to every node within radius R+1 of S. So:
First, throw away any node that is more than distance k from S, as it can't possibly be part of the optimum path. We have O(k^2) nodes remaining.
Now initialize a collection L and set L[S] = 0. Leave all other entries undefined.
Next, apply the L[v] rule to each node in the graph (ignore the k parameters)
If a predecessor u of a node v doesn't have a value for L[u] defined yet, ignore u when calculating L[v].
If no predecessor u of v has L[u] defined, leave L[v] undefined.
Repeat Step 3 k-1 more times.
If L[E] has a value, return it. Otherwise there is no path of length at most k from S to E.
This is O(k^3) time. You could probably speed it up for large graphs by only considering nodes both within distance 1 of S and distance k-1 of E during the first execution of Step 3, and only nodes within distance 2 of S and distance k-2 of E during the second execution, and so-on, but that'll still be cubic time.
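In Python, the relaxation above can be sketched like this (the names are my own; it assumes the graph is already a DAG, given as successor lists plus an edge-weight dict):

```python
def best_path_value(succ, w, S, E, k):
    """Max total edge weight over paths from S to E using at most k edges.

    succ[u] lists u's successors; w[(u, v)] is the weight of edge (u, v).
    L[v] holds the best known value of a path from S to v; each round
    extends every known path by one edge, as in the L(k, v) recurrence.
    Returns None if E is unreachable within k edges.
    """
    L = {S: 0}
    for _ in range(k):
        nxt = dict(L)  # carry old values forward: "at most k", not "exactly k"
        for u, Lu in L.items():
            for v in succ.get(u, ()):
                cand = Lu + w[(u, v)]
                if cand > nxt.get(v, float('-inf')):
                    nxt[v] = cand
        L = nxt
    return L.get(E)
```

Since G is a DAG, no walk can revisit a node, so carrying the old values forward is safe and gives the "length at most k" semantics for free.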
Disclaimer #1: I'm not a pro, so much of my nomenclature might not be standard or useful. Please bear with me / edit me.
Disclaimer #2: As the tags suggest, this may start out as a theoretical question, but I think it's a programming one, though some theory would also be nice.
First, let me describe this type of sorted weighted tree, hereafter called an SWR tree. Let T = (V, E, W, U, m, r) be an SWR tree. The only defining properties of T are:
T is an m-ary rooted tree with root r, and every leaf has the same height/level in T
T has predefined and unchanged weights on edges, defined by the function W: E -> R+ (R+ is the set of positive real numbers)
T has predefined and unchanged weights on leaves, defined by the function U: V_L -> R+ (V_L is the set of leaves in V)
For each non-leaf node v of T, its children are sorted in increasing order of the weights of the edges connecting them to v
Now, let me describe the function on T, now called F(T). F will produce a number on T as follows:
Extend the function U to U*: V -> R+ as follows: for each non-leaf node v, assign to v the largest weight among the child edges of v (the edges connecting v to its children)
For each height/level h of T, calculate f(h) as the minimum value of the vertices (defined by U*) at that height/level
Sum all of the f(h) to get F(T)
Also, let me describe the proper pruning process on T. Pruning operates on edges. When an edge is pruned, its sub-tree is removed. Not only that, all of its larger sibling edges (and their sub-trees) are also removed (keep in mind that, due to the sorting, only the larger sibling edges are considered). Hence, the remaining tree T' is still an SWR tree and properly inherits all properties from T. Obviously, F(T') changes (even U* and f change).
Therefore, the problem arises. Given an SWR tree T, how can one properly prune it to get an SWR tree T' with the maximum value of F ?
Disclaimer #3: I'm aware of the fact that the problem is like fallen from the sky and rather messy. Please feel free to reformulate it as you like. Also, just to formulate the problem itself exhausts me a bit, so I have had no handle to solve this yet.
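To make the definitions concrete, here is a brute-force baseline in Python for tiny trees. The representation is my own invention: a leaf is ('leaf', u), an internal node is ('node', [(w, child), ...]) with child edges pre-sorted by weight, and a proper pruning keeps a nonempty prefix of each node's children:

```python
from itertools import product

def F(tree):
    """F(T): for each level, take the min of U* over its nodes, then sum."""
    level = [tree]
    total = 0
    while level:
        vals, nxt = [], []
        for node in level:
            if node[0] == 'leaf':
                vals.append(node[1])  # U*(leaf) = its U-weight
            else:
                vals.append(max(w for w, _ in node[1]))  # U*(v) = largest child-edge weight
                nxt.extend(c for _, c in node[1])
        total += min(vals)
        level = nxt
    return total

def prunings(tree):
    """Yield every properly pruned version of tree: each internal node
    keeps a nonempty prefix of its weight-sorted child edges."""
    if tree[0] == 'leaf':
        yield tree
        return
    edges = tree[1]
    for k in range(1, len(edges) + 1):
        kept = edges[:k]
        for combo in product(*(list(prunings(c)) for _, c in kept)):
            yield ('node', list(zip((w for w, _ in kept), combo)))

def best_pruning(tree):
    """Exhaustively find the pruning maximizing F (exponential; tiny trees only)."""
    return max(prunings(tree), key=F)
```

This enumerates every legal pruning, so it can sanity-check any cleverer algorithm on small instances.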
Let's first simplify your problem definition slightly by removing the leaf weights. Since none of the weights are negative, we can put a single child under each leaf and move each leaf's weight onto its new child edge.
I can write down what seems like a pretty tight integer program that captures this problem. For each edge e, the variable x[e] is 1 if we keep the edge and 0 otherwise. The variable y[e] is 1 if e realizes the value f at its level (i.e., e's weight is the minimum, over that level's parent nodes, of the maximum kept child-edge weight), and 0 otherwise.
maximize sum_{e} W(e) y[e]
subject to
for all e, x[e] ∈ {0, 1}
for all e, y[e] ∈ {0, 1}
for all e sibling of e' with W(e) ≤ W(e'), x[e'] − x[e] ≤ 0
for all e parent of e', x[e'] − x[e] ≤ 0
for all levels ℓ, for all e at level ℓ, for all p at level ℓ−1, y[e] + x[p] − sum_{e' child of p with W(e) ≤ W(e')} x[e'] ≤ 1
for all levels ℓ, sum_{e at level ℓ} y[e] = 1
The first two constraint groups enforce the restrictions on pruning. The next constraint group says, essentially, an edge cannot be the minimum value of the maximum sibling on its level unless each sibling group on its level has an edge at least as valuable or is totally gone. The final constraint is only needed to break ties.
This formulation can be solved as is with an integer program solver, but I strongly suspect that there's a more efficient algorithm.
There is a directed graph (which might contain cycles), and each node has a value on it. How can we compute, for each node, the sum of the values of all nodes reachable from it? For example, in the following graph:
the reachable sum for node 1 is: 2 + 3 + 4 + 5 + 6 + 7 = 27
the reachable sum for node 2 is: 4 + 5 + 6 + 7 = 22
.....
My solution: to get the sums for all nodes, I think the time complexity is O(n + m), where n is the number of nodes and m is the number of edges. DFS should be used: for each node, recursively visit its successors, and save each node's sum once it has been computed, so that we don't need to calculate it again later. A set has to be kept for each node to avoid endless recursion caused by cycles.
Does this work? It doesn't feel elegant, especially since so many sets have to be created. Is there a better solution? Thanks.
This can be done by first finding the Strongly Connected Components (SCCs), which takes O(|V|+|E|). Then build a new graph, G', over the SCCs (each SCC is a node of G'), where each node's value is the sum of the values of the nodes in that SCC.
Formally,
G' = (V',E')
Where V' = {U1, U2, ..., Uk | U_i is a SCC of the graph G}
E' = {(U_i,U_j) | there is node u_i in U_i and u_j in U_j such that (u_i,u_j) is in E }
Then this graph (G') is a DAG, and the question becomes simpler; it seems to be a variant of the question linked in the comments.
EDIT: the previous answer (struck out) is a mistake from this point on; I am editing in a new answer. Sorry about that.
Now, a DFS can be used from each node to find the sum of values:
DFS(v):
if v.visited:
return 0
v.visited = true
if v is leaf:
return v.value
return v.value + sum([DFS(u) for u in v.children])
This is O(V^2 + VE) worst case, but since the condensed graph has fewer nodes, V and E are now significantly lower.
Some local optimizations can be made, for example, if a node has a single child, you can reuse the pre-calculated value and not apply DFS on the child again, since there is no fear of counting twice in this case.
A DP solution for this problem (DAG) can be:
D[i] = value(i) + sum {D[j] | (i,j) is an edge in G' }
This can be calculated in linear time (after topological sort of the DAG).
Pseudo code:
Find SCCs
Build G'
Topological sort G'
Find D[i] for each node in G'
For each U_i, assign the value D[i] to every node u_i in U_i.
Total time is O(|V|+|E|).
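The whole pipeline can be sketched in Python like this (Kosaraju's two-pass algorithm stands in for any linear-time SCC routine; one DFS per condensation component keeps shared descendants from being counted twice, and the result excludes each node's own value, matching the question's examples):

```python
from collections import defaultdict

def reachable_sums(n, edges, value):
    """For each node 0..n-1, the sum of values of all other nodes reachable
    from it. edges is a list of (u, v) pairs; value is a list of node values."""
    g, rg = defaultdict(list), defaultdict(list)
    for u, v in edges:
        g[u].append(v)
        rg[v].append(u)

    # Pass 1 (Kosaraju): iterative DFS on g recording finish order.
    seen, order = [False] * n, []
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(g[s]))]
        while stack:
            u, it = stack[-1]
            advanced = False
            for v in it:
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, iter(g[v])))
                    advanced = True
                    break
            if not advanced:
                order.append(u)
                stack.pop()

    # Pass 2: sweep the reverse graph in decreasing finish order -> SCC labels.
    comp = [-1] * n
    for root in reversed(order):
        if comp[root] != -1:
            continue
        comp[root] = root
        stack = [root]
        while stack:
            u = stack.pop()
            for v in rg[u]:
                if comp[v] == -1:
                    comp[v] = root
                    stack.append(v)

    # Build the condensation G' with summed component values.
    comp_sum, cadj = defaultdict(int), defaultdict(set)
    for u in range(n):
        comp_sum[comp[u]] += value[u]
    for u, v in edges:
        if comp[u] != comp[v]:
            cadj[comp[u]].add(comp[v])

    def comp_reach(c):
        # DFS over G'; a set of components avoids double counting.
        seen_c, stack = {c}, [c]
        while stack:
            x = stack.pop()
            for y in cadj[x]:
                if y not in seen_c:
                    seen_c.add(y)
                    stack.append(y)
        return sum(comp_sum[x] for x in seen_c)

    totals = {c: comp_reach(c) for c in set(comp)}
    return [totals[comp[u]] - value[u] for u in range(n)]
```

On the condensation the per-component DFS costs O(V'(V'+E')) overall, which is usually far cheaper than working on the original graph.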
You can use DFS or BFS to solve your problem.
Both have complexity O(V + E).
You don't have to count all the values for all nodes, and you don't need recursion.
Just make something like this.
Typically, DFS looks like this:
unmark all vertices
choose some starting vertex x
mark x
list L = x
while L nonempty
choose some vertex v from front of list
visit v
for each unmarked neighbor w
mark w
add it to end of list
In your case, you have to add some lines:
unmark all vertices
choose some starting vertex x
mark x
list L = x
float sum = 0
while L nonempty
choose some vertex v from front of list
visit v
sum += v->value
for each unmarked neighbor w
mark w
add it to end of list
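A direct Python translation of the modified traversal might look like this (note that, as written, it includes the start vertex's own value; subtract value[x] if you want to match the question's examples, which exclude it):

```python
from collections import deque

def reachable_sum(adj, value, x):
    """Sum of values over all vertices reachable from x, including x itself.

    adj maps each vertex to a list of neighbors; value maps vertex -> value.
    Mirrors the marked-traversal sketch above: mark on enqueue, sum on visit.
    """
    seen = {x}
    q = deque([x])
    total = 0
    while q:
        v = q.popleft()
        total += value[v]
        for w in adj.get(v, []):
            if w not in seen:
                seen.add(w)
                q.append(w)
    return total
```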
I have a directed graph that looks like this:
I want to find the cheapest path from Start to End where the orange dotted lines are all required for the path to be valid.
The natural shortest path would be Start -> A -> B -> End, with cost 5, but then we have not met all the required edge visits.
The path I want to find (via a general solution) is Start -> A -> B -> C -> D -> B -> End where the cost = 7 and we have met all required edge visits.
Does anyone have any thoughts on how to require such edge traversals?
Let R be the set of required edges and F = |R|. Let G be the input graph, t the starting point of the requested path, and s its ending point.
Preprocessing: A bunch of Dijkstra's algorithm runs...
The first step is to create another graph. This graph will have exactly F+2 vertices:
One for each edge in R
One for the starting point t of the path you want to compute
One for the ending point s of the path you want to compute
To create this graph, you will have to do the following:
Remove every edge in R from G.
For each edge E = (b,e) in R:
Compute the shortest path from t to b and the shortest path from e to s. If they exist, add an edge from t to E and an edge from E to s in the new graph, each weighing the length of the related shortest path.
For each edge E' = (b', e') in R \ {E}:
Compute the shortest path from e to b'. If it exists, add an edge from E to E' in the new graph, weighing the length of that shortest path.
In each case, attach the computed path as a payload to the relevant edge.
The complexity to build this graph is O((F+2)².(E+V).log(V)) where E (resp. V) is the number of edges (resp. vertices) in the original graph.
Exhaustive search for the best possible path
From this point, we have to find the shortest Hamiltonian Path in the newly created graph. Unfortunately, this task is a hard problem. We have no better way than exploring every possible path. But that doesn't mean we can't do it cleverly.
We will perform the search using backtracking. We can achieve this by maintaining two sets:
The list of currently explored vertices: K (K for Known)
The list of currently unknown vertices: U (U for Unknown)
Before digging in the algorithm definition, here are the main ideas. We cannot do anything else than exploring the whole space of possible paths in the new graph. At each step, we have to make a decision: which edge do we take next? This leads to a sequence of decisions until we cannot move anymore or we reached s. But now we need to go back and cancel decisions to see if we can do better by changing a direction. To cancel decisions we proceed like this:
Every time we are stuck (or found a path), we cancel the last decision we made
Each time we take a decision at some point, we keep track of which decision, so when we get back to this point, we know not to take that very same decision and explore the others that are available.
We can be stuck because:
We found a path.
We cannot move further (there is no edge we can explore or the only one we could take increases the current partial path too much -- its length becomes higher than the length of the current best path found).
The final algorithm can be summed up in this fashion: (I give an iterative implementation, one can find a recursive implementation a tad easier and clearer)
Let K ← [], L[0..F+1] ← [] and U ← V (where V is the set of every vertex in the working graph minus the starting and ending vertices t and s). Finally, let l ← i ← 0, best_path_length ← ∞ and best_path ← []
While (i ≥ 0):
While U ≠ []
c ← U.popFront() (we take the head of U)
L[i].pushBack(c)
If i == F AND l + weight(cur_path.back(), s) < best_path_length:
best_path_length ← l + weight(cur_path.back(), s)
best_path ← cur_path
If there is an edge e between K.tail() and c, and weight(e) + l < best_path_length: (if K is empty, then replace K.tail() with t in the previous statement)
K.pushBack(c)
i ← i+1
l ← weight(e) + l
cur_path.pushBack(c)
Concatenate L[i] at the end of U
L[i] ← []
i ← i-1
cur_path.popBack()
At the end of the while loop (while (i ≥ 0)), best_path will hold the best path (in the new graph). From there you just have to get the edges' payload to rebuild the path in the original graph.
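For a small set of required edges, the Hamiltonian-path search over the new graph can also be brute-forced over permutations. This Python sketch (the names are my own) simplifies the scheme above: it skips the edge-removal preprocessing and the path payloads, and lets connecting paths reuse required edges:

```python
import heapq
from itertools import permutations

def dijkstra(adj, src):
    """Standard Dijkstra; adj maps a vertex to a list of (neighbor, weight)."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

def cheapest_with_required(adj, t, s, required):
    """Cheapest walk from t to s traversing every (b, e, w) in required.

    Tries every visiting order of the required edges (F! orders, so only
    sensible for small F) and chains shortest paths between them.
    """
    dist_from = {t: dijkstra(adj, t)}
    for b, e, w in required:
        dist_from[e] = dijkstra(adj, e)  # distances from each edge's endpoint
    best = float('inf')
    for order in permutations(required):
        cost, cur, ok = 0, t, True
        for b, e, w in order:
            d = dist_from[cur].get(b)
            if d is None:
                ok = False
                break
            cost += d + w  # reach the edge's tail, then traverse it
            cur = e
        if ok:
            d = dist_from[cur].get(s)
            if d is not None:
                best = min(best, cost + d)
    return best
```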
For each node u in an undirected graph, let twodegree[u] be the sum of the degrees of u's neighbors. Show how to compute the entire array of twodegree[.] values in linear time, given a graph in adjacency list format.
This is the solution
for all u ∈ V:
degree[u] = 0
for all (u, w) ∈ E:
degree[u] = degree[u] + 1
for all u ∈ V:
twodegree[u] = 0
for all (u, w) ∈ E:
twodegree[u] = twodegree[u] + degree[w]
Can someone explain what degree[u] does in this case, and how twodegree[u] = twodegree[u] + degree[w] ends up being the sum of the degrees of u's neighbors?
Here, degree[u] is the degree of the node u (that is, the number of nodes adjacent to it). You can see this computed by the first loop, which iterates over all edges in the graph and increments degree[u] for each edge in the graph.
The second loop then iterates over every node in the graph and computes the sum of all its neighbors' degrees. It uses the fact that degree[u] is precomputed in order to run in O(m + n) time.
Hope this helps!
In addition to what @templatetypedef said, the statement twodegree[u] = twodegree[u] + degree[w] simply accumulates the twodegree of u, iteratively (cumulatively) adding the degrees of its neighbors (each temporarily bound to w).
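In Python, the two loops translate almost verbatim. Since the pseudocode iterates over the directed pairs stored in an adjacency list, an undirected edge list just updates both endpoints:

```python
def twodegrees(n, edges):
    """degree[u] = number of neighbors of u; twodegree[u] = sum of the
    degrees of u's neighbors. Nodes are 0..n-1; edges are undirected pairs.
    Two passes over the edge list, so O(m + n) total."""
    degree = [0] * n
    for u, w in edges:
        degree[u] += 1
        degree[w] += 1
    twodegree = [0] * n
    for u, w in edges:  # degree[] is fully computed before this pass
        twodegree[u] += degree[w]
        twodegree[w] += degree[u]
    return degree, twodegree
```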
I have an undirected graph. For now, assume that the graph is complete. Each node has a certain value associated with it. All edges have a positive weight.
I want to find a path between any 2 given nodes such that the sum of the values associated with the path nodes is maximum while at the same time the path length is within a given threshold value.
The solution should be "global", meaning that the path obtained should be optimal among all possible paths. I tried a linear programming approach but am not able to formulate it correctly.
Any suggestions or a different method of solving would be of great help.
Thanks!
If you are looking for an algorithm on a general graph, your problem is NP-complete. Assume the path-length threshold is n-1 and each vertex has value 1: if you can solve your problem, you can decide whether the given graph has a Hamiltonian path. In fact, if your maximum-value path has value n, then you have a Hamiltonian path. I think you can use something like the Held-Karp relaxation to find a good solution.
This might not be perfect, but if the threshold value (T) is small enough, there's a simple algorithm that runs in O(n^3 T^2). It's a small modification of Floyd-Warshall.
d = int array with size n x n x (T + 1)
initialize all d[i][j][k] to -infty
for i in nodes:
d[i][i][0] = value[i]
for e:(u, v) in edges:
d[u][v][w(e)] = value[u] + value[v]
for t in 1 .. T
for k in nodes:
for t' in 1..t-1:
for i in nodes:
for j in nodes:
d[i][j][t] = max(d[i][j][t],
d[i][k][t'] + d[k][j][t-t'] - value[k])
The result is the pair (i, j) with the maximum d[i][j][t] for all t in 0..T
EDIT: this assumes that the paths are allowed to be not simple, they can contain cycles.
EDIT2: This also assumes that if a node appears more than once in a path, it will be counted more than once. This is apparently not what OP wanted!
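A Python sketch of this modified Floyd-Warshall (the names are my own; per the edits above, walks may repeat nodes, and repeated nodes are counted repeatedly):

```python
def max_value_walk(n, edges, value, T):
    """Best node-value sum over (not necessarily simple) walks of total
    edge length at most T, over all endpoint pairs.

    edges: (u, v, w) undirected with positive integer weight w.
    d[i][j][t] = best value of a walk from i to j of edge length exactly t.
    """
    NEG = float('-inf')
    d = [[[NEG] * (T + 1) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        d[i][i][0] = value[i]
    for u, v, w in edges:
        if w <= T:
            d[u][v][w] = max(d[u][v][w], value[u] + value[v])
            d[v][u][w] = max(d[v][u][w], value[u] + value[v])
    for t in range(1, T + 1):
        for k in range(n):
            for tp in range(1, t):  # split t into tp + (t - tp), both >= 1
                for i in range(n):
                    for j in range(n):
                        if d[i][k][tp] > NEG and d[k][j][t - tp] > NEG:
                            cand = d[i][k][tp] + d[k][j][t - tp] - value[k]
                            if cand > d[i][j][t]:
                                d[i][j][t] = cand
    return max(d[i][j][t] for i in range(n) for j in range(n)
               for t in range(T + 1))
```

For fixed endpoints s and e you would instead read off max over t of d[s][e][t].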
Integer program (this may be a good idea or maybe not):
For each vertex v, let xv be 1 if vertex v is visited and 0 otherwise. For each arc a, let ya be the number of times arc a is used. Let s be the source and t be the destination. The objective is
maximize ∑v value(v) xv .
The constraints are
∑a value(a) ya ≤ threshold
∀v, ∑a has head v ya - ∑a has tail v ya = {-1 if v = s; 1 if v = t; 0 otherwise} (conserve flow)
∀v ≠ s, xv ≤ ∑a has head v ya (must enter a vertex to visit it)
∀v, xv ≤ 1 (visit each vertex at most once)
∀v ∉ {s, t}, ∀cuts S that separate vertex v from {s, t}, xv ≤ ∑a such that tail(a) ∉ S ∧ head(a) ∈ S ya (benefit only from vertices not on isolated loops).
To solve, do branch and bound with the relaxation values. Unfortunately, the last group of constraints are exponential in number, so when you're solving the relaxed dual, you'll need to generate columns. Typically for connectivity problems, this means using a min-cut algorithm repeatedly to find a cut worth enforcing. Good luck!
If you just add the weight of each node to the weights of its outgoing edges, you can forget about the node weights. Then you can use any of the standard algorithms for the shortest-path problem.
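A minimal sketch of that transformation (hypothetical names), assuming a directed adjacency list and a standard Dijkstra. Note that the destination's node weight is never folded into a traversed edge, so it has to be added back at the end:

```python
import heapq

def fold_node_weights(adj, node_w):
    """w'(u, v) = w(u, v) + node_w[u]: each node's weight is paid on leaving it."""
    return {u: [(v, w + node_w[u]) for v, w in nbrs] for u, nbrs in adj.items()}

def dijkstra(adj, src):
    """Standard Dijkstra; adj maps a vertex to a list of (neighbor, weight)."""
    dist = {src: 0}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float('inf')):
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

def cheapest_node_and_edge_cost(adj, node_w, s, t):
    """Cheapest s-to-t path cost counting both edge and node weights."""
    return dijkstra(fold_node_weights(adj, node_w), s)[t] + node_w[t]
```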