Breadth First Search vs A* with Manhattan Distance in a maze - algorithm

Given an initial state and a single final state in a maze, is it possible to design a maze in which breadth-first search expands fewer nodes than A* with Manhattan distance as the heuristic function? The cost of every move between adjacent nodes is 1.

It's impossible. The intuition is that your heuristic is more informed than the one BFS effectively uses. This is also the basis for proving it.
Formally:
h'(n) = 0 is also an admissible heuristic function.
BFS is basically A* using h' as its heuristic function (since it always expands based on f'(n) = g(n) + h'(n) = g(n)).
h dominates h', since for all n: h'(n) <= h(n).
Since h dominates h' and is monotone, the set of nodes expanded by the algorithm using h is a subset of those expanded by the algorithm using h'. More info and proof in this thread, and in the original article.
QED

There are three ways this can happen in general:
When A* is given an inconsistent heuristic. A* can do up to 2^N expansions of N states with an inconsistent heuristic. For an undirected graph, an inconsistent heuristic has some states with |h(a, g) - h(b, g)| > c(a, b), even if the heuristic is admissible (h(a, g) <= c(a, g)). Manhattan distance is consistent, so this won't apply to your example.
In a breadth-first search it is assumed that all edge costs are 1. Thus, when the goal is generated, it can terminate immediately, knowing that it has the optimal cost to the goal. A* typically does not terminate until the goal is selected for expansion. Note that if A* were guaranteed that all edges have uniform cost (or some minimum cost), then A* could do the same.
A* is only optimal up to necessary expansions - those with f(n) < C*, where C* is the optimal solution cost. If a problem instance has more than one state with f(n) = C*, then A* could do worse if it does poorly with tie-breaking and BFS gets lucky with tie-breaking.
So, now consider this example:
....._.
SXXXXG.
.......
Here, a . is open space, as is the _ cell. S is the start and G is the goal. The X cells are impassable, and the _ cell can be reached/expanded, but you can't go down from that cell to the goal.
If A* is unlucky it will expand the top route first (assume it expands largest g-cost first, but this also works with other tie-breaking). That route looks promising until it reaches the end, after which A* will have to expand the full alternate route.
Assume BFS gets lucky and expands the state below G before the _ state. Then, BFS will be able to terminate without expanding the _ state, and do fewer expansions than A*.
To be clear, label the states in the previous example as follows:
abcdefg
SXXXXG.
hijklmn
A* with unlucky tie-breaking will expand a, b, c, d, e, f and generate g (but not expand it). Then it will continue and expand h, i, j, k, l, and m, after which it can terminate with the solution.
BFS with lucky tie-breaking will expand h, a, i, b, j, c, k, d, l, e, and m and then terminate with the optimal solution. BFS does one less expansion than A*, because it doesn't expand f.
So, yes, it is possible for BFS to beat A* if the tie-breaking works out in favor of BFS.
Such examples will always be possible unless A* and BFS always expand states in the same order, or if further restrictions are put on the maze so that the heuristic is always perfect next to the goal.
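For anyone who wants to see this concretely, below is a small runnable sketch (my own construction, not part of the original answer). It assumes the extra rule that the _ cell cannot step down into G, lets BFS stop as soon as the goal is generated, and gives A* the "unlucky" tie-breaking (larger g first, then the top row). With these particular choices BFS needs one expansion fewer than A*, matching the count above.

import heapq
from collections import deque

MAZE = ["....._.",
        "SXXXXG.",
        "......."]
START, GOAL = (1, 0), (1, 5)
OPEN = {(r, c) for r, row in enumerate(MAZE)
        for c, ch in enumerate(row) if ch != 'X'}
BLOCKED_MOVES = {((0, 5), (1, 5))}       # the _ cell cannot move down into G

def neighbors(p):
    r, c = p
    for dr, dc in ((1, 0), (0, 1), (-1, 0), (0, -1)):   # down, right, up, left
        q = (r + dr, c + dc)
        if q in OPEN and (p, q) not in BLOCKED_MOVES:
            yield q

def manhattan(p):
    return abs(p[0] - GOAL[0]) + abs(p[1] - GOAL[1])

def bfs_expansions():
    expanded, seen, frontier = 0, {START}, deque([START])
    while frontier:
        node = frontier.popleft()
        expanded += 1
        for n in neighbors(node):
            if n == GOAL:
                return expanded          # BFS stops when the goal is generated
            if n not in seen:
                seen.add(n)
                frontier.append(n)

def astar_expansions():
    g = {START: 0}
    expanded, closed = 0, set()
    frontier = [(manhattan(START), 0, START)]
    while frontier:
        f, neg_g, node = heapq.heappop(frontier)
        if node in closed:
            continue
        if node == GOAL:
            return expanded              # A* stops when the goal is expanded
        closed.add(node)
        expanded += 1
        for n in neighbors(node):
            ng = g[node] + 1
            if ng < g.get(n, float("inf")):
                g[n] = ng
                # tie-break on smaller f, then larger g, then smaller (row, col)
                heapq.heappush(frontier, (ng + manhattan(n), -ng, n))

print("BFS expansions:", bfs_expansions())   # 12 with this tie-breaking
print("A* expansions:", astar_expansions())  # 13 with this tie-breaking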
See this paper for common misconceptions about A* search. One addresses the misconception that a better heuristic will do less work, and another addresses the misconception that A* with a 0-heuristic is the same as BFS.
--
Note 1: There is some discussion in the comments about whether the _ state is expanded if it doesn't have any successors. The original A* paper states:
Starting with the node s, they generate some part of the subgraph G, by repetitive application of the successor operator R. During the course of the algorithm, if R is applied to a node, we say that the algorithm has expanded that node.
Thus, if we apply the R operator and find no successors, we have still expanded the node. (They use a Greek letter, which I've replaced with R.) But, with a small edit to the map above, the _ state does have one successor, so _ is expanded by A* and a successor is generated (not the goal).
Note 2:
There is a question in the comments about Theorem 3 in the A* paper. That theorem states that there always exists some tie-breaking scheme for A* that will be at least as good as any other less informed algorithm. There are two problems with this theorem. First, it only states that A* is capable of beating another algorithm given the right tie-breaking, not that it will always beat every other algorithm. The second problem is that BFS is more informed than A*. BFS knows all edges have unit cost and A* does not. So, the theorem does not apply to BFS, because BFS has more information.
Note 3:
The question only asks "Is this possible." My answer provided here shows the precise conditions under which it is possible. The other answers (one of which has now been deleted) categorically state that it can never happen, and thus are incorrect.

Related

dijkstra algorithm, run only one time for shortest paths of some nodes (not two, not the whole graph)

So, Dijkstra's algorithm is (the best one) used to search for the shortest path in a weighted (no negative weights) and connected graph. Dijkstra's algorithm can be used to find the shortest path between two points/vertices, AND it can be used to find the shortest paths from one vertex to all the other vertices.
questions:
is my understanding correct?
Can it also be used to find the shortest paths of some pairs of vertices? For example, the graph has A, B, C, D, E, F, G, H, I, J, K, and we are only interested in the shortest paths of A,B ; C,K. Is it possible to run the algorithm only one time to find out both paths?
You will need to run two Dijkstras. One starting from A and one from C.
What you could do is run it from {A, C} (a "set" Dijkstra, with both vertices as sources) until you have found paths to B and K. But there is no guarantee that the resulting paths are actually from A to B and from C to K; they could just as well be C to B and C to K. Actually, all combinations of {A, C} to B and {A, C} to K are possible.
the best one
Not at all. It is a good concept and heavily used for many other similar algorithms. There are many variants, like A*, Arc-Flags, and others. But raw Dijkstra is super slow since it equally searches in all directions.
Imagine a query where you have modeled the whole world. Your destination is 1 hour away. Then Dijkstra will find shortest paths to all nodes that can be reached in 1 hour. So it will also consider a short flight to your neighboring country, even if it lies in totally the wrong direction. The A* algorithm is a simple modification of Dijkstra that tries to improve on that by introducing a heuristic function that is able to make (hopefully) good guesses about shortest-path distances. This gives your Dijkstra a sense of direction, so it prioritizes searching toward the destination first.
A simple heuristic is the as-the-crow-flies distance. Note that this heuristic does not perform well on road networks, and especially badly on transit networks (you often need to drive 10 minutes in the wrong direction to get onto a highway that lets you arrive earlier in the end, or you first need to travel to some big city to catch a good fast train). Other heuristics involve computing landmarks; they yield pretty good results but need a lot of pre-computation and space (usually not a problem).
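As a tiny illustration of the difference (a hypothetical sketch with made-up coordinate-based names, not a full router): A* orders its queue by g + h, where h is the straight-line guess to the destination, while plain Dijkstra effectively uses h = 0.

import math

def crow_flies(a, b):
    # straight-line ("as the crow flies") distance between two (x, y) points
    return math.hypot(a[0] - b[0], a[1] - b[1])

def a_star_priority(g_cost, node_xy, dest_xy):
    # Dijkstra's priority would be g_cost alone; A* adds the heuristic estimate
    return g_cost + crow_flies(node_xy, dest_xy)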
First of all, Dijkstra's is not the best algorithm, and many heuristic-based variants outperform it in most practical implementations.
we are only interested in the shortest paths of A,B ; C,K
Looks like you could take a look at the A* algorithm, which eliminates nodes that will not lead you to your final destination with least cost. But it requires a good estimation (and hence the word heuristic) of the distances from various nodes to the destination.
Coming to your main question, the answer is both yes and no, with some overhead. We all know that whenever a node gets removed from the min-heap, it is done for the algorithm.
So, as soon as your destination node gets removed from the min-heap, just terminate. But that does not mean it found the shortest distance only for the given pair: all the nodes that were removed from the min-heap before the destination node are also ones for which the shortest path has been found.
Also, it's possible that your destination node turns out to be the last node removed from the min-heap, which basically means that you have computed shortest paths from the source to all nodes.
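A minimal sketch of that early-termination idea (my own code, assuming an adjacency-dict representation): stop as soon as the destination is removed from the min-heap; every node removed before it already carries its final shortest distance.

import heapq

def dijkstra_to_target(adj, source, target):
    # adj: {node: [(neighbor, weight), ...]}, all weights non-negative
    dist = {source: 0}
    done = set()
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)                 # u's shortest distance is now final
        if u == target:
            return d                # terminate early: no need to settle the rest
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return None                     # target unreachable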

Modifying A* to find a path to closest of multiple goals on a rectangular grid

The problem: finding the path to the closest of multiple goals on a rectangular grid with obstacles. Only moving up/down/left/right is allowed (no diagonals). I did see this question and answers, and this, and that, among others. I didn't see anyone use or suggest my particular approach. Do I have a major mistake in my approach?
My most important constraint here is that it is very cheap for me to represent the path (or any list, for that matter) as a "stack", or a "singly-linked-list", if you want. That is, constant time access to the top element, O(n) for reversing.
The obvious (to me) solution is to search for the path from any of the goals back to the starting point, using a Manhattan distance heuristic. The first path from a goal to the starting point would be a shortest path to the closest goal (one of possibly many), and I don't need to reverse the path before following it (it would be in the "correct" order, with the starting point on top and the goal at the end).
In pseudo-code:
A*(start, goals) :
    init_priority_queue(start, goals, p_queue)
    return path(start, p_queue)

init_priority_queue(start, goals, p_queue) :
    for (g in goals) :
        f = manhattan_distance(start, g)        # path cost so far is 0, so f = h
        insert(f, [g], p_queue)                 # a one-element path holding the goal

path(start, p_queue) :
    f, path = extract_min(p_queue)
    if (top(path) == start) :
        return path
    else :
        expand(start, path, p_queue)
        return path(start, p_queue)

expand(start, path, p_queue) :
    this = top(path)
    for (n in next(this)) :
        f = length(path) + manhattan_distance(start, n)   # f = g + h
        new_path = push(n, path)
        insert(f, new_path, p_queue)
To me it seems only natural to reverse the search in this way. Is there a think-o in here?
And another question: assuming that my priority queue is stable on elements with the same priority (if two elements have the same priority, the one inserted later will come out earlier). I have left my next above undefined on purpose: randomizing the order in which the possible next tiles on a rectangular grid are returned seems a very cheap way of finding an unpredictable, rather zig-zaggy path through a rectangular area free of obstacles, instead of going along two of the edges (a zig-zag path is just statistically more probable). Is that correct?
It's correct and efficient in the big O as far as I can see (N log N as long as the heuristic is admissible and consistent, where N = number of cells of the grid, assuming you use a priority queue whose operations work in log N). The zig-zag will also work.
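The randomized next() the question sketches could be as simple as the following (a hypothetical snippet; the caller is assumed to filter out walls and out-of-bounds cells):

import random

def next_cells(cell):
    r, c = cell
    moves = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
    random.shuffle(moves)   # random tie-breaking between equally promising successors
    return moves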
p.s. For this sort of problem there is a more efficient "priority queue" that works in O(1). By this sort of problem I mean the case where the effective distance between every pair of adjacent nodes is a very small constant (3 in this problem).
Edit: as requested in the comment, here are the details for a constant time "priority queue" for this problem.
First, transform the graph as follows: let the potential of a node in the graph (i.e., of a cell in the grid) be the Manhattan distance from that node to the goal (i.e., the heuristic). Call the potential of node i P(i). Previously there was an edge of weight 1 between adjacent cells; in the modified graph the weight w(i, j) is changed to w(i, j) - P(i) + P(j). This is exactly the graph used in the proof of why A* is optimal and terminates in polynomial time when the heuristic is admissible and consistent. Note that the Manhattan distance heuristic for this problem is both admissible and consistent.
The first key observation is that A* on the original graph is exactly the same as Dijkstra on the modified graph, since the "value" of node i in the modified graph is exactly its distance from the origin node plus P(i) (minus the constant P(origin), which does not affect the ordering). The second key observation is that the weight of every edge in the transformed graph is either 0 or 2. Thus, we can simulate A* by using a deque (a double-ended queue) instead of a priority queue: whenever we traverse an edge of weight 0, push the new node to the front of the deque, and whenever we traverse an edge of weight 2, push it to the back.
Thus, this algorithm simulates A* and works in linear time in the worst case.
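Here is a runnable sketch of that deque idea for the single-goal case (my own code, assuming the grid is given as a list of strings with '#' for obstacles): every reduced edge weight 1 - P(i) + P(j) is 0 or 2, so pushing 0-weight successors to the front and 2-weight successors to the back replaces the heap entirely.

from collections import deque

def astar_deque(grid, start, goal):
    rows, cols = len(grid), len(grid[0])

    def P(cell):                        # potential = Manhattan distance to the goal
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    INF = float("inf")
    dist = {start: 0}                   # distances in the reweighted graph
    done = set()
    dq = deque([start])
    while dq:
        u = dq.popleft()
        if u in done:
            continue
        done.add(u)
        if u == goal:
            return dist[u] + P(start)   # original distance, since P(goal) = 0
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            v = (u[0] + dr, u[1] + dc)
            if not (0 <= v[0] < rows and 0 <= v[1] < cols) or grid[v[0]][v[1]] == '#':
                continue
            w = 1 - P(u) + P(v)         # reduced weight, always 0 or 2 here
            if dist[u] + w < dist.get(v, INF):
                dist[v] = dist[u] + w
                (dq.appendleft if w == 0 else dq.append)(v)
    return None                         # goal unreachable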

Finding a shortest path that passes through some arbitrary sequence of nodes?

In this earlier question the OP asked how to find a shortest path in a graph that goes from u to v and also passes through some node w. The accepted answer, which is quite good, was to run Dijkstra's algorithm twice - once to get from u to w and once to get from w to v. This has time complexity equal to two calls to Dijkstra's algorithm, which is O(m + n log n).
Now consider a related question - you are given a sequence of nodes u1, u2, ..., uk and want to find the shortest path from u1 to uk such that the path passes through u1, u2, ..., uk in order. Clearly this could be done by running k-1 instances of Dijkstra's algorithm, one for each pair of adjacent vertices, then concatenating the shortest paths together. This takes time O(km + kn log n). Alternatively, you could use an all-pairs shortest paths algorithm like Johnson's algorithm to compute all shortest paths, then concatenate the appropriate shortest paths together in O(mn + n² log n) time, which is good for k much larger than n.
My question is whether there is an algorithm for solving this problem that is faster than the above approaches when k is small. Does such an algorithm exist? Or is iterated Dijkstra's as good as it gets?
Rather than running isolated instances of Dijkstra's algorithm to find the paths u(k) -> u(k+1) one at a time, could a single instance of a modified Dijkstra-like search be started at each node in the sequence simultaneously, with the paths formed when search regions meet "in the middle"?
This would potentially cut down on the total number of edges visited and reduce re-traversal of edges compared to making a series of isolated calls to Dijkstra's algorithm.
An easy example would be finding the path between two nodes: it would be better to expand search regions around both nodes than to expand around just one. In the case of a uniform graph, the second option would give a search region with radius equal to the distance between the nodes, while the first option would give two regions of half that radius - less overall search area.
Just a thought.
EDIT: I guess I'm talking about a multi-directional variant of a bi-directional search, with as many directions as there are nodes in the sequence {u(1), u(2), ..., u(m)}.
I don't see how we can do much better; here's all I can think of. Assuming the graph is undirected, the shortest path from any node u to node v is the same as that from v to u (in reverse, of course).
Now, for your case of a shortest path in the order u1 u2 u3 ... un, we could run Dijkstra's algorithm on u2 (and find the shortest paths u1-u2 and u2-u3 in one run), then on u4 (for u3-u4 and u4-u5), then u6, and so on. This reduces the number of times you apply Dijkstra's algorithm by roughly a half. Note that, complexity-wise, this is identical to the original solution.
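A sketch of that halving idea for an undirected graph (my own code; it assumes an adjacency dict with symmetric edges and that all waypoints are reachable):

import heapq

def dijkstra(adj, src):
    # adj: {node: [(neighbor, weight), ...]}, weights non-negative, edges symmetric
    dist, parent = {src: 0}, {src: None}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                    # stale heap entry
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], parent[v] = nd, u
                heapq.heappush(pq, (nd, v))
    return parent

def walk_back(parent, node):
    # path from the Dijkstra source to `node`, recovered from the parent map
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return path

def path_through(adj, waypoints):
    # one Dijkstra per even-positioned waypoint u2, u4, ... covers two legs each
    full = [waypoints[0]]
    for i in range(1, len(waypoints), 2):
        parent = dijkstra(adj, waypoints[i])
        left = walk_back(parent, waypoints[i - 1])[::-1]   # leg u(i-1) -> u(i)
        full += left[1:]
        if i + 1 < len(waypoints):
            right = walk_back(parent, waypoints[i + 1])    # leg u(i) -> u(i+1)
            full += right[1:]
    return full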
You get the shortest paths from one vertex to all the others in the graph with one call to Dijkstra's algorithm. Thus you only need to do a search for each unique starting vertex, so repeated vertices do not make the problem any harder.

Negative weights using Dijkstra's Algorithm

I am trying to understand why Dijkstra's algorithm will not work with negative weights. Reading an example on Shortest Paths, I am trying to figure out the following scenario:
    2
A-------B
 \     /
3 \   / -2
   \ /
    C
From the website:
Assuming the edges are all directed from left to right, if we start with A, Dijkstra's algorithm will choose the edge (A,x) minimizing d(A,A)+length(edge), namely (A,B). It then sets d(A,B)=2 and chooses another edge (y,C) minimizing d(A,y)+d(y,C); the only choice is (A,C) and it sets d(A,C)=3. But it never finds the shortest path from A to B, via C, with total length 1.
I cannot understand why, using the following implementation of Dijkstra, d[B] will not be updated to 1 (when the algorithm reaches vertex C, it will run a relax on B, see that d[B] equals 2, and therefore update its value to 1).
Dijkstra(G, w, s) {
    Initialize-Single-Source(G, s)
    S ← Ø
    Q ← V[G]    // priority queue keyed by d[v]
    while Q ≠ Ø do
        u ← Extract-Min(Q)
        S ← S ∪ {u}
        for each vertex v in Adj[u] do
            Relax(u, v)
}

Initialize-Single-Source(G, s) {
    for each vertex v ∈ V(G)
        d[v] ← ∞
        π[v] ← NIL
    d[s] ← 0
}

Relax(u, v) {
    // update only if we found a strictly shorter path
    if d[v] > d[u] + w(u,v)
        d[v] ← d[u] + w(u,v)
        π[v] ← u
        Update(Q, v)
}
Thanks,
Meir
The algorithm you have suggested will indeed find the shortest path in this graph, but not in all graphs in general. For example, consider this directed graph: A->B with weight 1, A->C with weight 0, A->D with weight 99, D->B with weight -300, and B->C with weight 1.
Let's trace through the execution of your algorithm.
First, you set d(A) to 0 and the other distances to ∞.
You then expand out node A, setting d(B) to 1, d(C) to 0, and d(D) to 99.
Next, you expand out C, with no net changes.
You then expand out B, which has no effect.
Finally, you expand D, which changes d(B) to -201.
Notice, though, that at the end of this d(C) is still 0, even though the shortest path to C has length -200. This means that your algorithm doesn't compute the correct distances to all the nodes. Moreover, even if you were to store back pointers saying how to get from each node to the start node A, you'd end up taking the wrong path back from C to A.
The reason for this is that Dijkstra's algorithm (and your algorithm) is a greedy algorithm that assumes that once it has computed the distance to some node, that distance must be the optimal distance. In other words, the algorithm doesn't allow itself to take the distance of a node it has already expanded and change what that distance is. In the case of negative edges, your algorithm, and Dijkstra's algorithm, can be "surprised" by seeing a negative-cost edge that would indeed decrease the cost of the best path from the starting node to some other node.
Note that Dijkstra's algorithm works even for negative weights, if the graph has no negative cycles, i.e. cycles whose summed-up weight is less than zero.
Of course one might ask why, in the example made by templatetypedef, Dijkstra fails even though there are no negative cycles, in fact not even any cycles. That is because he is using another stopping criterion, which halts the algorithm as soon as the target node is reached (or all nodes have been settled once; he did not specify that exactly). In a graph without negative weights this works fine.
If one uses the alternative stopping criterion, which stops the algorithm when the priority queue (heap) runs empty (this stopping criterion was also used in the question), then Dijkstra will find the correct distances even for graphs with negative weights but without negative cycles.
However, in this case the asymptotic time bound of Dijkstra for graphs without negative cycles is lost. This is because a previously settled node can be reinserted into the heap when a better distance is found due to negative weights. This property is called label correcting.
TL;DR: The answer depends on your implementation. For the pseudo code you posted, it works with negative weights.
Variants of Dijkstra's Algorithm
The key point is that there are 3 kinds of implementations of Dijkstra's algorithm, and all the answers under this question ignore the differences among these variants.
Using a nested for-loop to relax vertices. This is the easiest way to implement Dijkstra's algorithm. The time complexity is O(V^2).
Priority-queue/heap based implementation + NO re-entrance allowed, where re-entrance means a relaxed vertex can be pushed into the priority-queue again to be relaxed again later.
Priority-queue/heap based implementation + re-entrance allowed.
Versions 1 & 2 will fail on graphs with negative weights (if you get the correct answer in such cases, it is just a coincidence), but version 3 still works.
The pseudo code posted under the original problem is version 3 above, so it works with negative weights.
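A small sketch (my own code) contrasting variant 2 and variant 3 on the four-node example graph from the earlier answer: the "no re-entrance" version leaves d(C) at 0, while the re-entrant version finds the true distance -200 at the price of re-expanding vertices.

import heapq

GRAPH = {                       # directed, one negative edge, no negative cycle
    'A': [('B', 1), ('C', 0), ('D', 99)],
    'B': [('C', 1)],
    'C': [],
    'D': [('B', -300)],
}

def dijkstra_variant(adj, source, allow_reentrance):
    dist = {source: 0}
    settled = set()
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                              # stale heap entry
        settled.add(u)
        for v, w in adj[u]:
            if not allow_reentrance and v in settled:
                continue                          # variant 2: never touch a settled vertex again
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

print(dijkstra_variant(GRAPH, 'A', allow_reentrance=False))
# {'A': 0, 'B': 1, 'C': 0, 'D': 99}       -- d(C) is wrong, the true distance is -200
print(dijkstra_variant(GRAPH, 'A', allow_reentrance=True))
# {'A': 0, 'B': -201, 'C': -200, 'D': 99} -- correct, at the cost of re-expansions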
Here is a good reference from Algorithms (4th edition), which says the following (and contains the Java implementations of versions 2 & 3 I mentioned above):
Q. Does Dijkstra's algorithm work with negative weights?
A. Yes and no. There are two shortest paths algorithms known as Dijkstra's algorithm, depending on whether a vertex can be enqueued on the priority queue more than once. When the weights are nonnegative, the two versions coincide (as no vertex will be enqueued more than once). The version implemented in DijkstraSP.java (which allows a vertex to be enqueued more than once) is correct in the presence of negative edge weights (but no negative cycles) but its running time is exponential in the worst case. (We note that DijkstraSP.java throws an exception if the edge-weighted digraph has an edge with a negative weight, so that a programmer is not surprised by this exponential behavior.) If we modify DijkstraSP.java so that a vertex cannot be enqueued more than once (e.g., using a marked[] array to mark those vertices that have been relaxed), then the algorithm is guaranteed to run in E log V time but it may yield incorrect results when there are edges with negative weights.
For more implementation details and the connection of version 3 with the Bellman-Ford algorithm, please see this answer on Zhihu. It is also my answer (but in Chinese). Currently I don't have time to translate it into English. I would really appreciate it if someone could do this and edit this answer on Stack Overflow.
You did not use S anywhere in your algorithm (besides modifying it). The idea of Dijkstra is that once a vertex is in S, it will not be modified ever again. In this case, once B is inside S, you will not reach it again via C.
This fact ensures the complexity of O(E + V log V) [otherwise, you would repeat edges more than once, and vertices more than once].
In other words, the algorithm you posted might not run in O(E + V log V), as promised by Dijkstra's algorithm.
Since Dijkstra's algorithm takes a greedy approach, once a vertex is marked as visited it will never be reevaluated again, even if there is another, cheaper path to reach it later on. And such an issue can only happen when negative edges exist in the graph.
A greedy algorithm, as the name suggests, always makes the choice that seems to be the best at that moment. Assume that you have an objective function that needs to be optimized (either maximized or minimized) at a given point. A Greedy algorithm makes greedy choices at each step to ensure that the objective function is optimized. The Greedy algorithm has only one shot to compute the optimal solution so that it never goes back and reverses the decision.
Consider what happens if you go back and forth between B and C...voila
(relevant only if the graph is not directed)
Edited:
I believe the problem has to do with the fact that a path through C (A->C->...) can only be better than A->B if negative-weight edges exist. So it doesn't matter where you go after A->C: under the assumption of non-negative edge weights, it is impossible to find a path better than A->B once you have chosen to reach B after going through A->C.
"2) Can we use Dijksra’s algorithm for shortest paths for graphs with negative weights – one idea can be, calculate the minimum weight value, add a positive value (equal to absolute value of minimum weight value) to all weights and run the Dijksra’s algorithm for the modified graph. Will this algorithm work?"
This absolutely doesn't work unless all shortest paths have the same number of edges. Adding a constant to every edge increases the cost of a path by (number of edges) * |most negative weight|, so paths with different numbers of edges are penalized by different amounts: a two-edge path has its total cost increased by 2 * |most negative weight|, while a three-edge path is increased by 3 * |most negative weight|. For example, if A->C has weight 1 while A->B->C has weights 2 and -2, the second path (total 0) is originally shorter, but after adding 2 to every edge it costs 4 while the direct edge costs 3, so the shortest path changes.
You can use Dijkstra's algorithm with negative edges (as long as there is no negative cycle), but you must allow a vertex to be visited multiple times, and that version loses its fast time complexity.
In that case, practically, I've seen it's better to use the SPFA algorithm, which uses a normal queue and can handle negative edges.
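A sketch of SPFA (my own code): it is essentially Bellman-Ford driven by a FIFO queue, re-queueing a vertex whenever its distance improves. It assumes no negative cycle is reachable from the source, otherwise it will not terminate.

from collections import deque

def spfa(adj, source):
    # adj: {node: [(neighbor, weight), ...]}; negative edges allowed, no reachable negative cycle
    dist = {source: 0}
    in_queue = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        in_queue.discard(u)
        for v, w in adj.get(u, ()):
            nd = dist[u] + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                if v not in in_queue:   # re-queue v so its out-edges get relaxed again
                    in_queue.add(v)
                    queue.append(v)
    return dist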
I will be just combining all of the comments to give a better understanding of this problem.
There can be two ways of using Dijkstra's algorithm:
Marking the nodes whose minimum distance from the source has already been found (the faster variant, since we won't be revisiting nodes whose shortest path has already been found)
Not marking the nodes whose minimum distance from the source has already been found (a bit slower than the above)
Now the question arises: what if we don't mark the nodes, so that we can find shortest paths even in the presence of negative weights?
The answer is simple. Consider a case where you only have negative weights in the graph, say a cycle 0 -> 1 -> 2 -> 0 with edge weights -1, -4 and -3:
Now, if you start from node 0 (the source), the steps will be as follows (here I'm not marking the nodes):
0->0 as 0, 0->1 as inf , 0->2 as inf in the beginning
0->1 as -1
0->2 as -5
0->0 as -8 (since we are not marking the nodes)
0->1 as -9 .. and so on
This loop will go on forever, therefore Dijkstra's algorithm fails to find the minimum distance in case of negative weights (considering all the cases).
That's why the Bellman-Ford algorithm is used to find the shortest paths in the presence of negative weights, as it will stop the loop in the case of a negative cycle.
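For contrast, here is a sketch of Bellman-Ford (my own code) on that all-negative cycle, with the edge weights read off the trace above (-1, -4 and -3): it performs at most |V| - 1 rounds of relaxation, and one extra round that still improves a distance reveals the negative cycle.

def bellman_ford(n, edges, source):
    INF = float("inf")
    dist = [INF] * n
    dist[source] = 0
    for _ in range(n - 1):                       # n - 1 rounds always suffice
        for u, v, w in edges:
            if dist[u] != INF and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:                        # one more round: any improvement
        if dist[u] != INF and dist[u] + w < dist[v]:
            return None, True                    # means a reachable negative cycle
    return dist, False

edges = [(0, 1, -1), (1, 2, -4), (2, 0, -3)]
print(bellman_ford(3, edges, 0))                 # (None, True): negative cycle detected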

How is Manhattan distance an admissible heuristic?

Isn't it true that counting the moves for one tile can lead to other tiles getting to their goal state? And hence counting for each tile can give us a count greater than the minimum number of moves required to reach the goal state?
This question is in context of Manhattan distance for 15-Puzzle.
Here is the Question in different words:
Can we use Manhattan distance as an admissible heuristic for the N-Puzzle? To implement A* search we need an admissible heuristic. Is the Manhattan heuristic a candidate? If yes, how do you counter the above argument (the first three sentences in the question)?
Definitions: A* is a kind of search algorithm. It uses a heuristic function to determine the estimated distance to the goal. As long as this heuristic function never overestimates the distance to the goal, the algorithm will find the shortest path, probably faster than breadth-first search would. A heuristic that satisfies that condition is admissible.
Admissible heuristics must not overestimate the number of moves to solve this problem. Since you can only move one block at a time, and only in one of 4 directions, the optimal scenario for each block is that it has a clear, unobstructed path to its goal state; in that case the number of moves it needs is exactly its Manhattan distance, because each move changes its Manhattan distance by exactly 1.
Every other situation for a block is sub-optimal, meaning it will take more moves than its Manhattan distance to get the block into the right place. Thus, the heuristic never over-estimates and is admissible.
I will delete when someone posts a correct, formal version of this.
Formal Proof:
By definition of h, h(s∗) = 0 if s∗ is the goal state. Assume, for proof by contradiction, that C∗ < h(s0) for some initial state s0. Note that, since each action can move only one tile, performing an action can reduce h by at most one. Since the goal can be reached in C∗ actions, we have h(s∗) ≥ h(s0) − C∗ > 0, which brings us to a contradiction since h(s∗) should be zero. Therefore, we must have h(s0) ≤ C∗ for all s0, and h is admissible.
(Source: University of California, Irvine)
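For completeness, a small sketch of the heuristic itself (my own code, assuming the standard goal with tiles 1..15 in order and the blank in the last cell): it sums the per-tile Manhattan distances, and since each legal move slides exactly one tile by one cell, the sum can decrease by at most 1 per move and therefore never exceeds the true number of remaining moves.

def manhattan_15(state):
    # state: tuple of 16 ints in row-major order; tile t belongs at index t - 1, 0 is the blank
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal = tile - 1
        total += abs(idx // 4 - goal // 4) + abs(idx % 4 - goal % 4)
    return total

solved = tuple(range(1, 16)) + (0,)
print(manhattan_15(solved))   # 0: the solved board needs no more moves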
