I have been trying to implement the Iterative Deepening A* algorithm on a graph that contains cycles. I have looked at the pseudocode from Wikipedia, shown below:
node current node
g the cost to reach current node
f estimated cost of the cheapest path (root..node..goal)
h(node) estimated cost of the cheapest path (node..goal)
cost(node, succ) step cost function
is_goal(node) goal test
successors(node) node expanding function, expand nodes ordered by g + h(node)
procedure ida_star(root)
bound := h(root)
loop
t := search(root, 0, bound)
if t = FOUND then return bound
if t = ∞ then return NOT_FOUND
bound := t
end loop
end procedure
function search(node, g, bound)
f := g + h(node)
if f > bound then return f
if is_goal(node) then return FOUND
min := ∞
for succ in successors(node) do
t := search(succ, g + cost(node, succ), bound)
if t = FOUND then return FOUND
if t < min then min := t
end for
return min
end function
However, the problem is that this pseudocode does not deal with cycles: when the search enters a cycle, the loop does not terminate. How can this be handled?
I recommend you create two node lists and check them on each iteration:
Open list: contains the nodes that have not been expanded yet, sorted by the evaluation function f(n) = g(n) + h(n). Initially it contains the root. To expand a node, you take the first one from the list and add its successors to the list.
Closed list: contains the nodes that have already been expanded. Before expanding a node, check that it is not in the closed list; if it is, discard it.
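A minimal Python sketch of this idea applied to the recursive search above; here the closed set holds the nodes on the current path, which is enough to keep the search out of cycles. The callback interfaces (h, successors, cost, is_goal) are assumptions for illustration, not part of the original pseudocode:

```python
import math

def ida_star(root, h, successors, cost, is_goal):
    """IDA* that skips nodes already on the current path, so cycles terminate."""
    bound = h(root)
    path = {root}  # closed set for the branch currently being explored

    def search(node, g, bound):
        f = g + h(node)
        if f > bound:
            return f
        if is_goal(node):
            return "FOUND"
        minimum = math.inf
        for succ in successors(node):
            if succ in path:          # already on this path: entering a cycle
                continue
            path.add(succ)
            t = search(succ, g + cost(node, succ), bound)
            path.discard(succ)
            if t == "FOUND":
                return "FOUND"
            minimum = min(minimum, t)
        return minimum

    while True:
        t = search(root, 0, bound)
        if t == "FOUND":
            return bound
        if t == math.inf:
            return None  # NOT_FOUND
        bound = t
```

Keeping only the current path (rather than every expanded node) preserves IDA*'s linear memory footprint while still breaking cycles; a full closed list also works but costs more memory.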
Hope this helps.
My professor showed the following problem in class and mentioned that the answer is O(1), while mine was quite different. I hope to get some help understanding what mistakes I made.
Question:
Calculate the amortized time complexity of the method F in an AVL tree, when we start from the minimal node and each time call F on the last found member.
Description of F: from the current node, F continues just like an inorder traversal until it reaches the next node in inorder order, which is then used as the starting point for the next call.
What I did:
First I took an arbitrary series of m calls to F.
I said that for the first call we need O(log n) to find the minimal node; for each following call the inorder walk continues one more step, so it costs O(log n) + 1, and so on until m elements are scanned.
Which gets me to:
T(m) = m * O(log n) + (1 + 2 + ... + (m - 1)) = O(m log n + m^2)
To calculate the amortized time we compute T(m)/m, so I get:
T(m)/m = O(log n + m)
Which isn't O(1) for sure.
The algorithm doesn't start by searching for any node, but instead is already passed a node and will start from that node. E.g. pseudocode for F would look like this:
F(n):
    if n has a right child
        n = right child of n
        while n has a left child
            n = left child of n
        return n
    else
        prev = n
        cur = parent of n
        while prev is the right child of cur and cur is not the root
            prev = cur
            cur = parent of prev
        if cur is the root and prev is the right child of cur
            error "Reached end of traversal"
        else
            return cur
The above code basically does an in-order traversal of a tree starting from a node until the next node is reached.
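For concreteness, F translates directly into runnable Python; the Node class with left/right/parent pointers below is a hypothetical minimal representation, not something from the original question:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def successor(n):
    """In-order successor of n, or None at the end of the traversal."""
    if n.right is not None:
        # next node is the leftmost node of the right subtree
        n = n.right
        while n.left is not None:
            n = n.left
        return n
    # otherwise climb until we leave a left subtree
    cur = n.parent
    while cur is not None and n is cur.right:
        n = cur
        cur = cur.parent
    return cur  # None signals "Reached end of traversal"
```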
Amortized runtime:
Pick an arbitrary tree and m. Let r_0 be the lowest common ancestor of all nodes visited by F. Now define r_(n + 1) as the lowest common ancestor of all nodes in the right subtree of r_n that will be returned by F. This recursion bottoms out for r_u, which will be the m-th node in in-order traversal. Any r_n will be returned by F in some iteration, so all nodes in the left subtree of r_n will be returned by F as well.
All nodes that will be visited by F are either also returned by F or are nodes on the path from r_0 to r_u. Since r_0 is an ancestor of r_1 and r_1 is an ancestor of r_2, etc., the path from r_0 to r_u can be at most as long as the right subtree is high. The height of the tree is limited by log_phi(m + 2), so in total at most
m + log_phi(m + 2)
nodes will be visited during m iterations of F. All nodes visited by F form a subtree, so there are at most 2 * (m + log_phi(m + 2)) edges that will be traversed by the algorithm, leading to an amortized runtime-complexity of
2 * (m + log_phi(m + 2)) / m = 2 + 2 * log_phi(m + 2) / m = O(1)
(The above bounds are in reality considerably tighter, but they are completely sufficient for the calculation presented here.)
You are given a directed acyclic graph G = (V,E). Each directed edge e ∈ E has weight w_e associated with it. Given two vertices s,t ∈ V such that s has no incoming edge and t has no outgoing edge, we are interested in a maximum weight directed path that begins at s and ends at t. The weight of a path is the sum of the weights of the directed edges comprising the path. (A directed graph is acyclic if it has no directed cycles in it.)
How do I solve it using dynamic programming techniques? I have been stuck for a while; any tips are appreciated :D
The key here is understanding that "dynamic programming" just means breaking a problem into overlapping subproblems and reusing each subproblem's result instead of recomputing it. Under this definition we can consider caching (memoization) to be an instance of dynamic programming.
So let's start with an implementation without dynamic programming. Using backtracking, we can perform a DEPTH FIRST (this will be important later) set of traversals starting from s and ending at t.
let P(a,b) be a path from a->b
let w(p) be the total weight of some path p
let K be the exhaustive set of P(s,t) // a.k.a every path that exists
// Returns Max(p) p ∈ K
function findMaxPath(G)
return findMaxPath(s)
// Returns Max(P(n, t))
function findMaxPath(n)
if (n === t)
return an empty path // we are already at the target
declare p = null
for each e of n // every outgoing edge
let q = G(n, e)
let l = findMaxPath(q) // get the maximum path from the neighbor node to t
if (l == null) continue
l = e + l // prepend the outgoing edge to the max path of the child node
if (p == null or w(l) > w(p)) p = l // keep the most expensive path that eventually reaches t
return p // return null if we can't reach t
The problem with this solution is that it is really slow. In particular, you end up recalculating a LOT of paths. Take the following graph:
In the process of calculating P(s, t), you end up executing findMaxPath(n) the following number of times for each n:
findMaxPath(s) 1
findMaxPath(a) 1
findMaxPath(b) 1
findMaxPath(c) 1
findMaxPath(d) 3
findMaxPath(e) 3
findMaxPath(f) 3
findMaxPath(g) 3
findMaxPath(h) 9 (wow!)
In this example findMaxPath(h) has to get calculated 9 times, a number that can increase dramatically in more complex topologies (this one is fairly trivial).
So to reduce execution time, we can keep a "cache" of calls to findMaxPath(n). This is "dynamic" because the execution path of the function changes over time for identical inputs.
let P(a,b) be a path from a->b
let w(p) be the total weight of some path p
let K(n) be the exhaustive set of P(n,t) // a.k.a every path that exists
let C be a cache of Max(w(p)) p ∈ K(n)
// Returns Max(w(p)) p ∈ K(s)
function findMaxPath(G)
return findMaxPath(s)
// Returns Max(P(n, t))
function findMaxPath(n)
if exists C[n]
return C[n] // we already know the most expensive path from n->t
if (n === t)
return an empty path // we are already at the target
declare p = null
for each e of n // every outgoing edge
let q = G(n, e)
let l = findMaxPath(q) // get the maximum path from the neighbor node to t
if (l == null) continue
l = e + l // prepend the outgoing edge to the max path of the child node
if (p == null or w(l) > w(p)) p = l // keep the most expensive path that eventually reaches t
C[n] = p
return p // return null if we can't reach t
This gives us a total cache "hit" rate of 16/25 in the example, making the runtime substantially faster.
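A runnable Python sketch of the memoized search; the dict-based graph representation mapping each node to (neighbor, weight) pairs is an assumption for illustration:

```python
def find_max_path(graph, s, t, cache=None):
    """Return (weight, path) of the maximum-weight s->t path, or None if t
    is unreachable.  graph: dict node -> list of (neighbor, weight) edges."""
    if cache is None:
        cache = {}
    if s == t:
        return (0, [t])          # empty path: we are already at the target
    if s in cache:
        return cache[s]          # cache hit: reuse the earlier result
    best = None
    for v, w in graph.get(s, []):
        sub = find_max_path(graph, v, t, cache)
        if sub is None:
            continue             # this neighbor never reaches t
        candidate = (sub[0] + w, [s] + sub[1])
        if best is None or candidate[0] > best[0]:
            best = candidate
    cache[s] = best
    return best
```

Each node is now fully expanded at most once, so the runtime drops from exponential to O(V + E).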
Let G = (V, E) be a directed graph with nodes v_1, v_2,..., v_n. We say that G is an ordered graph if it has the following properties.
Each edge goes from a node with lower index to a node with a higher index. That is, every directed edge has the form (v_i, v_j) with i < j.
Each node except v_n has at least one edge leaving it. That is, for every node v_i, there is at least one edge of the form (v_i, v_j).
Give an efficient algorithm that takes an ordered graph G and returns the length of the longest path that begins at v_1 and ends at v_n.
My attempt:
Dynamic programming: Opt(i) = max{Opt(j)} + 1, for all j such that j is reachable from i.
Is there perhaps a better way to do this? I think even with memoization my algorithm will still be exponential. (this is just from an old midterm review I found online)
Your approach is right; you will have to do
Opt(i) = max{Opt(j)} + 1 for all j such that j is reachable from i
However, this is exponential only if you run it without memoization. With memoization, you will have the memoized optimal value for every node j, j > i, when you are on node i.
For the worst case complexity, let us assume that every two nodes that could be connected are connected. This means, v_1 is connected with (v_2, v_3, ... v_n); v_i is connected with (v_(i+1), v_(i+2), ... v_n).
Number of Vertices (V) = n
Hence, number of edges (E) = n*(n-1)/2 = O(V^2)
Let us focus our attention on a vertex v_k. For this vertex, we have to go through the already derived optimal values of (n-k) nodes.
Number of ways of reaching v_k directly = (k-1)
Hence the worst case time complexity is the sum of (k-1)*(n-k) from k=1 to k=n, which is a sum over a degree-2 polynomial and hence results in O(n^3) time complexity.
Simplistically, the worst case time complexity is O(n^3) = O(V^3) = O(E) * O(V) = O(EV).
Thanks to the first property, this problem can be solved in O(V^2), or even better in O(E), where V is the number of vertices and E is the number of edges. Indeed, it uses a dynamic programming approach quite similar to the one you give. Let opt[i] be the length of the longest path from v_1 to v_i. Then
opt[i] = max(opt[j]) + 1, where j < i and v_j and v_i are connected,
and using this equation, it can be solved in O(V^2).
Even better, we can solve this in another order.
int LongestPath() {
    for (int v = 1; v <= V; ++v) opt[v] = -1;
    opt[1] = 0;
    for (int v = 1; v <= V; ++v) {
        if (opt[v] >= 0) {
            /* Each edge is visited at most once,
               so the runtime is bounded by |E|. */
            for_each(v' that can be reached from v)
                opt[v'] = max(opt[v] + 1, opt[v']);
        }
    }
    return opt[V];
}
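The same forward pass can be sketched in runnable Python; the dict-of-lists adjacency representation adj and the 1..n vertex numbering are assumptions matching the problem statement:

```python
def longest_path_length(n, adj):
    """Length (in edges) of the longest path from vertex 1 to vertex n
    in an ordered graph.  adj[v] lists successors of v, all greater than v."""
    opt = [-1] * (n + 1)   # -1 marks "not reachable from vertex 1"
    opt[1] = 0
    for v in range(1, n + 1):
        if opt[v] < 0:
            continue       # v is unreachable, so it extends no path
        for w in adj.get(v, []):
            opt[w] = max(opt[w], opt[v] + 1)
    return opt[n]
```

Because every edge goes from a lower to a higher index, scanning vertices in increasing order guarantees opt[v] is final before any edge out of v is relaxed, giving O(V + E) total work.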
I want to find number of paths between two nodes in a DAG. O(V^2) and O(V+E) are acceptable.
O(V+E) suggests to me somehow using BFS or DFS, but I don't know how.
Can somebody help?
Do a topological sort of the DAG, then scan the vertices from the target backwards to the source. For each vertex v, keep a count of the number of paths from v to the target. When you get to the source, the value of that count is the answer. That is O(V+E).
The number of distinct paths from node u to v is the sum of the numbers of distinct paths from x to v over all direct successors x of u.
Store the number of paths to the target node v for each node (temporarily set to 0), then go from v (where the value is 1) in the opposite edge direction and recompute this value for each node (summing the values of all its direct successors) until you reach u.
If you process the nodes in topological order (again in the opposite direction), you are guaranteed that all direct successors have already been computed when you visit a given node.
Hope it helps.
This question has been asked elsewhere on SO, but nowhere has the simpler solution of using DFS + DP been mentioned; all solutions seem to use topological sorting. The simpler solution goes like this (paths from s to t):
Add a field to the vertex representation to hold an integer count. Initially, set vertex t’s count to 1 and other vertices’ count to 0. Start running DFS with s as the start vertex. When t is discovered, it should be immediately marked as finished (BLACK), without further processing starting from it. Subsequently, each time DFS finishes a vertex v, set v’s count to the sum of the counts of all vertices adjacent to v. When DFS finishes vertex s, stop and return the count computed for s. The time complexity of this solution is O(V+E).
Pseudo-code:
simple_path (s, t)
    if (s == t)
        return 1
    else if (path_count[s] != NULL)
        return path_count[s]
    else
        path_count[s] = 0
        for each node w ∈ adj[s]
            path_count[s] = path_count[s] + simple_path(w, t)
        end
        return path_count[s]
    end
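The pseudocode above can be written as runnable Python with a per-node memo table; the dict-of-lists graph representation is an assumption:

```python
def count_paths(graph, s, t, memo=None):
    """Number of distinct directed paths from s to t in a DAG.
    graph: dict node -> list of successor nodes."""
    if memo is None:
        memo = {}
    if s == t:
        return 1
    if s in memo:
        return memo[s]          # already counted the paths out of s
    memo[s] = sum(count_paths(graph, w, t, memo) for w in graph.get(s, []))
    return memo[s]
```

Each node's count is computed once and each edge inspected once, giving the required O(V + E) runtime.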
I was asked this question in an interview, but I couldn't come up with any decent solution. So, I told them the naive approach of finding all the cycles then picking the cycle with the least length.
I'm curious to know what is an efficient solution to this problem.
You can easily modify Floyd-Warshall algorithm. (If you're not familiar with graph theory at all, I suggest checking it out, e.g. getting a copy of Introduction to Algorithms).
Traditionally, you start with path[i][i] = 0 for each i. But you can instead start from path[i][i] = INFINITY. It won't affect the algorithm itself, as those zeroes weren't used in the computation anyway (path[i][j] never changes for k == i or k == j).
In the end, path[i][i] is the length of the shortest cycle going through i. Consequently, you need to find min(path[i][i]) over all i. And if you want the cycle itself (not only its length), you can recover it just like a normal path: by memorizing k during the execution of the algorithm.
In addition, you can also use Dijkstra's algorithm to find a shortest cycle going through any given node. If you run this modified Dijkstra for each node, you'll get the same result as with Floyd-Warshall. And since each Dijkstra is O(n^2), you'll get the same O(n^3) overall complexity.
The pseudo code is a simple modification of Dijkstra's algorithm.
for all u in V:
    for all v in V:
        path[u][v] = infinity

for all s in V:
    path[s][s] = 0
    H = makequeue(V)  // using the values in the path[s] array as keys
    while H is not empty:
        u = deletemin(H)
        for all edges (u,v) in E:
            if path[s][v] > path[s][u] + l(u,v) or path[s][s] == 0:
                path[s][v] = path[s][u] + l(u,v)
                decreaseKey(H, v)

lengthMinCycle = INT_MAX
for all v in V:
    if path[v][v] < lengthMinCycle and path[v][v] != 0:
        lengthMinCycle = path[v][v]

if lengthMinCycle == INT_MAX:
    print("The graph is acyclic.")
else:
    print("Length of minimum cycle is ", lengthMinCycle)
Time Complexity: O(|V|^3)
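A runnable Python sketch of the Dijkstra-per-source idea; instead of the path[s][s] == 0 trick, this version closes each cycle explicitly with an edge leading back into the source (the adjacency-list representation is an assumption):

```python
import heapq

def shortest_cycle_through(graph, s):
    """Length of the shortest cycle through s, or infinity if none exists.
    graph: dict u -> list of (v, weight) with non-negative weights."""
    dist = {u: float('inf') for u in graph}
    dist[s] = 0
    pq = [(0, s)]
    while pq:                       # plain Dijkstra from s
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                # stale queue entry
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    # close the cycle with any edge that leads back into s
    return min((dist[u] + w for u in graph for v, w in graph[u] if v == s),
               default=float('inf'))

def minimum_cycle(graph):
    return min(shortest_cycle_through(graph, s) for s in graph)
```

With a binary heap each call is O(E log V), so running it from every source gives O(VE log V); with an array-based queue it matches the O(V^3) of the pseudocode above.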
Perform DFS
During the DFS keep track of the type of each edge
The edge types are Tree Edge, Back Edge, Down Edge and Parent Edge
Keep track of when you get a Back Edge, and keep another counter for the cycle length
See Algorithms in C++ Part 5 - Robert Sedgewick for more details
What you will have to do is assign a second weight to each edge, which is always 1. Now run any shortest path algorithm from a node back to the same node using these weights. While considering the intermediate paths, ignore the paths whose actual weights are negative.
Below is a simple modification of the Floyd-Warshall algorithm.
V = 4
INF = 999999

def minimumCycleLength(graph):
    # start from a copy of the adjacency matrix; the diagonal stays INF
    # so that dist[i][i] can record the shortest cycle through i
    dist = [[graph[i][j] for j in range(V)] for i in range(V)]
    for k in range(V):
        for i in range(V):
            for j in range(V):
                dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
    # the minimum cycle length is the smallest diagonal entry
    return min(dist[i][i] for i in range(V))

graph = [[INF, 1, 1, INF],
         [INF, INF, 1, INF],
         [1, INF, INF, 1],
         [INF, INF, INF, 1]]

print(minimumCycleLength(graph))