A* Algorithm Search - algorithm

I have a tree like the one below. Numbers on the edges are costs (g) and numbers in the nodes are the estimated distances to the goal from the heuristic function (h). The goal is shaded in grey.
If I start at S, would the route found by A-star search (f(x) = g(x) + h(x)) be as follows: S>B>H>M?
This is an odd case, because if we look instead at the greedy search algorithm, where the function for determining the next move is f(x) = h(x), we consider only the values in the nodes and select the smallest one. Based on this we start at S and then go on to A (lowest value is best), but the leftmost branch is incorrect, as it does not lead to any of the goal nodes. Would I be correct to assume that a greedy search will fail with this tree?

Firstly, this is not a tree; it's a DAG, because some nodes have multiple parents.
Secondly, yes, A* will return the correct result with this heuristic, because the heuristic is admissible (i.e., it never overestimates the true cost). If that were not true, A* might not return the correct result.

No, the greedy search will walk through S->A->D->B->F.
A heuristic only tries to speed up the search; it won't make the search fail. In the worst case the search just takes longer than it would without a heuristic.
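To illustrate the point, here is a minimal sketch of greedy best-first search. The graph and the h values are hypothetical (they are not the ones from the question's figure), and the frontier is a plain array scanned linearly for brevity. Because unexplored alternatives stay on the frontier, the search walks into the dead end first and then continues from the next-best node instead of failing.

```javascript
// Greedy best-first search: always expand the frontier node with the lowest h.
// The graph and h values are hypothetical, not taken from the question's figure.
function greedyBestFirst(graph, h, start, goal) {
  const frontier = [{ node: start, path: [start] }];
  const visited = new Set();
  while (frontier.length > 0) {
    // Pick the entry with the smallest heuristic value (linear scan for brevity).
    let best = 0;
    for (let i = 1; i < frontier.length; i++) {
      if (h[frontier[i].node] < h[frontier[best].node]) best = i;
    }
    const { node, path } = frontier.splice(best, 1)[0];
    if (node === goal) return path;
    if (visited.has(node)) continue;
    visited.add(node);
    for (const next of Object.keys(graph[node] || {})) {
      if (!visited.has(next)) frontier.push({ node: next, path: [...path, next] });
    }
  }
  return null; // goal unreachable
}

const greedyGraph = {
  S: { A: 1, B: 4 },
  A: { D: 2 }, // A looks best by h, but this branch is a dead end
  B: { G: 3 },
  D: {},
  G: {},
};
const greedyH = { S: 5, A: 1, D: 2, B: 3, G: 0 };
const greedyPath = greedyBestFirst(greedyGraph, greedyH, 'S', 'G');
console.log(greedyPath.join(' -> ')); // S -> B -> G
```

The nodes are expanded in the order S, A, D, B, G; the edge costs are never consulted, which is why the path a greedy search returns is not guaranteed to be the cheapest one.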

Related

Which heuristic function is used in Best-First Search?

So the major difference between Best-First Search (informed) and Uniform-Cost Search (uninformed) is that in Best-First Search we use a heuristic function to determine which node to visit next. In UCS, we always take the node with the lowest cost calculated from the initial state.
What is the heuristic function used in Best-First Search? It is mentioned everywhere that the heuristic function is h(n) = f(n), but what's f(n) exactly and how do I get its value if my "map" has many nodes and only the cost of the paths from one node to another?
A heuristic function is not a unique thing. It is a decision you make that heavily depends on the particular properties of the problem that is being solved. And even then you can choose between different approaches (functions). Often you'll try out how a chosen function influences the quality of the solutions found in sample cases, and test alternatives.
For example, if the graph is a Euclidean graph, where nodes represent coordinates in n-dimensional space and the cost of an edge is its length (the distance between the connected nodes), then one possible heuristic is the straight-line distance between the current node and the target node.
The less you can assume about a graph (the less you know about its properties), the harder it will be to find a suitable heuristic function.
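As a sketch of the Euclidean case, assuming each node's coordinates are stored as a plain array in a lookup table (the coords table, the node names, and makeHeuristic are made up for illustration):

```javascript
// Straight-line (Euclidean) distance between two points in n-dimensional space.
// Points are plain arrays of coordinates; this representation is an assumption.
function euclidean(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

// Build a heuristic h(n) for a fixed target from a hypothetical coordinate table.
function makeHeuristic(coords, target) {
  return (node) => euclidean(coords[node], coords[target]);
}

const coords = { A: [0, 0], B: [3, 0], C: [3, 4] };
const hToC = makeHeuristic(coords, 'C');
console.log(hToC('A')); // 5
console.log(hToC('B')); // 4
console.log(hToC('C')); // 0
```

If every edge costs at least the straight-line distance between its endpoints, this estimate can never exceed the true remaining cost.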

In the A* (path finding) algorithm, why must h() be *admissible*?

According to the Wikipedia article on the A* search algorithm it says:
Here, g(n) is the known cost of getting from the initial node to n;
this value is tracked by the algorithm. h(n) is a heuristic estimate
of the cost to get from n to any goal node. For the algorithm to find
the actual shortest path, the heuristic function must be admissible,
meaning that it never overestimates the actual cost to get to the
nearest goal node. The heuristic function is problem-specific and must
be provided by the user of the algorithm.
It specifically states that the h() function must not overestimate the distance. Yet it seems to me that in my code, if my heuristic h() function returns infinity (or zero), it performs just as well and still finds the shortest path.
So why should it be admissible? Isn't a value of infinity an overestimate? I feel like my node graph is complex enough. Are there specific situations where this would make a difference that I perhaps have not reproduced in my graph?
Addendum:
See this fiddle and feel free to mess with the h function at line 221. Click on the floorplan to move the red dot.
Any of the following commented lines work equally well for the h() function.
var h = function(a, b) {
    //return calcDistance(a,b);
    //return 0;
    return 999999;
}
If your heuristic is not admissible, then you will sometimes "settle for less than the best."
Suppose your search has just reached the goal node. Can you stop? Or is there yet to be found a better path to the goal?
If the heuristic never overestimates the shortest path from any node to the goal, you can look at each frontier node N and compare (Cost to get to N) + (Heuristic for N) to (Cost to get to the goal via the path I already found). If there isn't any node N for which it is still possible to find a shorter path to the goal, then you're done.
If your heuristic is not admissible, this reasoning will not work.
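Here is a small sketch of that effect. It is a minimal A* written for illustration, not the code from the fiddle; the three-node graph and the heuristic values are hypothetical. It also hints at why a constant such as 999999 can appear to work: a constant adds the same offset to every f value, so the expansion order is the same as with h = 0. Trouble starts when the overestimate differs from node to node.

```javascript
// Minimal A*: returns the cost of the path found from start to goal, or Infinity.
function aStar(graph, h, start, goal) {
  const open = [{ node: start, g: 0 }];
  const bestG = { [start]: 0 };
  while (open.length > 0) {
    // Take the open entry with the smallest f = g + h (linear scan for brevity).
    let best = 0;
    for (let i = 1; i < open.length; i++) {
      if (open[i].g + h(open[i].node) < open[best].g + h(open[best].node)) best = i;
    }
    const { node, g } = open.splice(best, 1)[0];
    if (node === goal) return g;   // stop as soon as the goal leaves the open list
    if (g > bestG[node]) continue; // stale entry, a cheaper one was already handled
    for (const [next, cost] of Object.entries(graph[node] || {})) {
      const ng = g + cost;
      if (!(next in bestG) || ng < bestG[next]) {
        bestG[next] = ng;
        open.push({ node: next, g: ng });
      }
    }
  }
  return Infinity;
}

// Hypothetical graph: the direct edge S->G costs 3, the detour S->A->G costs 2.
const demoGraph = { S: { A: 1, G: 3 }, A: { G: 1 }, G: {} };

const admissible = (n) => ({ S: 2, A: 1, G: 0 }[n]);   // never overestimates
const inadmissible = (n) => ({ S: 2, A: 5, G: 0 }[n]); // overestimates at A (true cost is 1)
const constant = () => 999999;                          // same offset for every node

const costAdmissible = aStar(demoGraph, admissible, 'S', 'G');
const costInadmissible = aStar(demoGraph, inadmissible, 'S', 'G');
const costConstant = aStar(demoGraph, constant, 'S', 'G');
console.log(costAdmissible, costInadmissible, costConstant); // 2 3 2
```

With the overestimate at A, the goal reached through the direct edge has f = 3 while A has f = 6, so the search stops with the path of cost 3 and never discovers the cheaper detour.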

dijkstra's shortest path algorithm backtracks?

I am trying to implement dijkstra's shortest path algorithm using map reduce.
I have two questions:
Does this algorithm backtrack to re-evaluate distances when a path that was not selected turns out to be shorter? For example, take two possible paths to a destination with edge weights 1->2->5 and 2->3->2. Path 1 would be selected first because 1 < 2, but the overall sum of weights is smaller for path 2 (2->3->2), so I want to know whether Dijkstra's algorithm takes care of this by backtracking.
Please give me a brief idea of how map and reduce function will be in this case. I am thinking of emitting in map function as and in reduce function and in reduce function I iterate over associated weights to find the least weighted neighbour ..but after that how it function. Please give me a good idea of how it happens from scratch in a cluster and what happens internally.
Dijkstra's does not perform backtracking to re-evaluate the distances.
http://upload.wikimedia.org/wikipedia/commons/5/57/Dijkstra_Animation.gif
That GIF should help you understand how Dijkstra's algorithm re-evaluates distances. It avoids backtracking by storing the shortest distance to node n found so far inside node n.
During traversal, if the algorithm comes across node n again, it compares the distance it has just traversed to reach node n with the value stored in node n. If the new distance is greater, it is ignored; if it is smaller, it replaces the value stored in node n.
Dijkstra's does, however, have a limitation: it assumes non-negative edge weights. With negative edges it can settle a node before its true shortest distance is known, and if the graph contains a negative cycle there is no shortest path at all, so that is something you should be wary of.
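The compare-and-replace step can be sketched as follows. This is a plain single-machine version, not a MapReduce formulation, and the node names are made up; the graph encodes the two paths from the question (weights 1, 2, 5 versus 2, 3, 2).

```javascript
// Dijkstra's algorithm with a linear scan for the closest unsettled node.
// Every node must appear as a key in the adjacency object.
function dijkstra(graph, source) {
  const dist = {};
  for (const node of Object.keys(graph)) dist[node] = Infinity;
  dist[source] = 0;
  const settled = new Set();
  while (true) {
    // Select the unsettled node with the smallest tentative distance.
    let u = null;
    for (const node of Object.keys(graph)) {
      if (!settled.has(node) && (u === null || dist[node] < dist[u])) u = node;
    }
    if (u === null || dist[u] === Infinity) break; // nothing reachable is left
    settled.add(u);
    for (const [v, w] of Object.entries(graph[u])) {
      // Relaxation: keep the smaller of the stored and the newly found distance.
      if (dist[u] + w < dist[v]) dist[v] = dist[u] + w;
    }
  }
  return dist;
}

const twoPaths = {
  S: { A: 1, C: 2 },
  A: { B: 2 },
  B: { T: 5 },
  C: { D: 3 },
  D: { T: 2 },
  T: {},
};
const shortest = dijkstra(twoPaths, 'S');
console.log(shortest.T); // 7
```

T is first recorded at distance 8 through the path that starts with the cheaper edge, then lowered to 7 when D is settled, without any backtracking.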

Difference and advantages between dijkstra & A star [duplicate]

I read this:
http://en.wikipedia.org/wiki/A*_search_algorithm
It says A* is faster than using Dijkstra and uses best-first search to speed things up.
If I need the algorithm to run in milliseconds, when does A* become the most prominent choice?
From what I understand, it does not necessarily return the best results.
If I need quick results, is it better to pre-compute the paths? It may take megabytes of space to store them.
"It says A* is faster than using Dijkstra and uses best-first search to speed things up."
A* is basically an informed variation of Dijkstra.
A* is considered a "best-first search" because it greedily chooses which vertex to explore next according to the value of f(v) = h(v) + g(v), where h is the heuristic and g is the cost so far.
Note that if you use a non-informative heuristic function, h(v) = 0 for each v, A* chooses which vertex to expand next according to the cost so far (g(v)) only, the same as Dijkstra's algorithm. So if h(v) = 0, A* reduces to Dijkstra's algorithm.
"If I need the algorithm to run in milliseconds, when does A* become the most prominent choice?"
Not necessarily; it depends on a lot of things. If you have a decent heuristic function then, in my personal experience, greedy best-first search (choosing according to the heuristic function alone) is usually significantly faster than A* (but its results are not even near optimal).
"From what I understand, it does not necessarily return the best results."
A* is both complete (finds a path if one exists) and optimal (always finds the shortest path) if you use an admissible heuristic function. If your function is not admissible, all bets are off.
"If I need quick results, is it better to pre-compute the paths? It may take megabytes of space to store them."
This is a common optimization for some problems, for example the 15-puzzle, but it is more advanced. A path from point A to point B is called a macro. Some paths are very useful and should be remembered. A machine learning component is added to the algorithm to speed things up by remembering these macros.
Note that the path from point A to point B here is usually not a path in the state graph but in the problem itself (for example, how to move a square from the lowest row to the top row...).
To speed things up:
If you have a heuristic but find A* too slow, and you want a quicker solution even if it is not optimal, A* Epsilon is usually faster than A* while giving you a bound on the optimality of the path (how close it is to being optimal).
Dijkstra is a special case of A* (when the heuristic is zero).
A* search:
It has two cost functions.
g(n): same as Dijkstra. The real cost to reach a node n.
h(n): approximate cost from node n to the goal node. It is a heuristic function. This heuristic function should never overestimate the cost. That means the real cost to reach the goal node from node n should be greater than or equal to h(n). Such a function is called an admissible heuristic.
The total cost of each node is calculated by f(n)=g(n)+h(n)
Dijkstra's:
It has one cost function, which is the real cost from the source to each node: f(n)=g(n)
It finds the shortest path from the source to every other node by considering only the real cost.
A* is just like Dijkstra; the only difference is that A* tries to look for a better path by using a heuristic function that gives priority to nodes that are supposed to be better than others, while Dijkstra's just explores all possible paths.
Its optimality depends on the heuristic function used, so yes, it can return a non-optimal result. At the same time, the better the heuristic fits your specific layout, the better the results (and possibly the speed) will be.
It is meant to be faster than Dijkstra, even though it requires more memory and more operations per node, because it explores far fewer nodes and the gain usually outweighs the overhead.
Precomputing the paths could be the only way if you need realtime results and the graph is quite large, although in most applications routes are computed only occasionally (I'm assuming you want to calculate them often).
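To give a feel for "explores far fewer nodes", here is a rough sketch that counts expanded cells on an open 4-connected grid with unit step cost. The grid size, the start and goal cells, and the function names are arbitrary choices for illustration. The same search routine runs once with h = 0 (which makes it behave like Dijkstra) and once with the Manhattan distance.

```javascript
// Best-first search on an open 4-connected grid with unit step cost.
// h is the heuristic; h = () => 0 turns the search into Dijkstra's algorithm.
function gridSearch(width, height, start, goal, h) {
  const key = (x, y) => y * width + x;
  const bestG = new Map([[key(start[0], start[1]), 0]]);
  const open = [{ x: start[0], y: start[1], g: 0 }];
  const closed = new Set();
  let expanded = 0;
  while (open.length > 0) {
    // Linear scan for the smallest f = g + h (a heap would be used in practice).
    let best = 0;
    for (let i = 1; i < open.length; i++) {
      const fi = open[i].g + h(open[i].x, open[i].y);
      const fb = open[best].g + h(open[best].x, open[best].y);
      if (fi < fb) best = i;
    }
    const cur = open.splice(best, 1)[0];
    const k = key(cur.x, cur.y);
    if (closed.has(k)) continue; // already expanded through a path at least as cheap
    closed.add(k);
    expanded++;
    if (cur.x === goal[0] && cur.y === goal[1]) return { cost: cur.g, expanded };
    for (const [dx, dy] of [[1, 0], [-1, 0], [0, 1], [0, -1]]) {
      const nx = cur.x + dx;
      const ny = cur.y + dy;
      if (nx < 0 || ny < 0 || nx >= width || ny >= height) continue;
      const ng = cur.g + 1;
      const nk = key(nx, ny);
      if (!bestG.has(nk) || ng < bestG.get(nk)) {
        bestG.set(nk, ng);
        open.push({ x: nx, y: ny, g: ng });
      }
    }
  }
  return { cost: Infinity, expanded };
}

const gridGoal = [9, 9];
const manhattan = (x, y) => Math.abs(x - gridGoal[0]) + Math.abs(y - gridGoal[1]);
const viaDijkstra = gridSearch(20, 20, [5, 5], gridGoal, () => 0);
const viaAStar = gridSearch(20, 20, [5, 5], gridGoal, manhattan);
console.log('Dijkstra:', viaDijkstra, 'A*:', viaAStar);
```

Both runs return the same path cost, but the heuristic keeps A* inside the rectangle between start and goal, while the h = 0 run spreads out in every direction. On grids with obstacles the gap can be smaller, so treat the numbers as illustrative only.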
These algorithms can be used in pathfinding and graph traversal, the process of plotting an efficiently directed path between multiple points, called nodes.
The formula for A* is f = g + h, where g is the actual cost and h is the heuristic cost.
The formula for Dijkstra's is f = g; there is no heuristic cost. When we use A* with a heuristic cost of 0, it is equivalent to Dijkstra's algorithm.
Short answer:
A* uses heuristics to optimize the search. That is, you are able to define a function that can, to some degree, estimate the cost from one node to the target. This is particularly useful when you are searching for a path on a geographical representation (map), where you can, for instance, guess the distance to the target from a given graph node. Hence A* is typically used for path finding in games etc., whereas Dijkstra is used in more generic cases.
No, A* won't always give the best path.
If the heuristic is the "geographical" distance, the following example might give a non-optimal path (presumably because the link between the two airports makes the real cost lower than the geographical estimate suggests):
[airport] - [road] - [start] -> [road] -> [road] -> [road] -> [road] -> [target] - [airport]
|----------------------------------------------------------------|

A* Search Modification

The Wikipedia listing for A* search states:
In other words, the closed set can be omitted (yielding a tree search algorithm) if a solution is guaranteed to exist, or if the algorithm is adapted so that new nodes are added to the open set only if they have a lower f value than at any previous iteration.
However, in doing so, I have found that I receive erroneous results in an otherwise functional A* search implementation. Can someone shed some light on how one would make this modification?
Make sure your heuristic satisfies the following for every node x and each of its successors y:
h(x) <= d(x,y) + h(y)
This condition is called consistency (or monotonicity): the estimate at x may not exceed the cost of the step to y plus the estimate at y. A consistent heuristic with h(goal) = 0 also never overestimates the cost of getting from your current location to the destination or goal.
For example, if you are on a grid and trying to get from A to B, both points on this grid, a good heuristic function is the Euclidean distance between the current location and the goal:
h(x) = sqrt[ (crtX - goalX)^2 + (crtY - goalY)^2 ]
This heuristic does not overestimate because of the triangle inequality.
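One way to test the inequality on a concrete graph is to check it edge by edge. The sketch below assumes a small hypothetical graph whose edge costs equal the Euclidean length of each edge; pos, roads, and the function names are made up for illustration.

```javascript
// Check h(x) <= d(x, y) + h(y) for every directed edge (x, y) of a weighted graph.
function isConsistent(graph, h) {
  for (const [x, edges] of Object.entries(graph)) {
    for (const [y, d] of Object.entries(edges)) {
      if (h(x) > d + h(y) + 1e-9) return false; // small tolerance for floating point
    }
  }
  return true;
}

// Hypothetical grid points; edge costs equal the Euclidean length of each edge.
const pos = { A: [0, 0], B: [1, 0], C: [1, 1], G: [2, 1] };
const roads = {
  A: { B: 1, C: Math.SQRT2 },
  B: { C: 1 },
  C: { G: 1 },
  G: {},
};
const straightLine = (n) => Math.hypot(pos[n][0] - pos.G[0], pos[n][1] - pos.G[1]);
const inflated = (n) => 3 * straightLine(n); // deliberately too large

console.log(isConsistent(roads, straightLine)); // true
console.log(isConsistent(roads, inflated));     // false
```

If the check fails for the heuristic in use, dropping the closed set (or never reopening nodes) can return wrong results, which may explain the erroneous output described in the question.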
More on triangle inequality: http://en.wikipedia.org/wiki/Triangle_inequality
More on Euclidean distance: http://mathworld.wolfram.com/Distance.html

Resources