Heuristic function for finding the path using A star - algorithm

I am trying to find a optimal solution for the following problem
The numbers denoted inside each node are represented as (x,y).
The adjacent nodes to a node always have a y value that is (current nodes y value +1).
There is a cost of 1 for a change in the x value when we go from one node to its adjacent
There is no cost for going from node to its adjacent, if there is no change in the value of x.
No 2 nodes with the same y value are considered adjacent.
The optimal solution is the one with the lowest cost, I'm thinking of using A* path finding algorithm for finding an optimal solution.
My question, Is A* a good choice for the this kind of problems, or should i look at any other algorithm, and also i was thinking of using recursive method to calculate the Heuristic cost, but i get the feeling that it is not a good idea.
This is the example of how I'm thinking the heuristic function will be like this
The heuristic weight of a node = Min(heuristic weight of it's child nodes)
The same goes for the child nodes too.
But as far as my knowledge goes, heuristic is meant to be an approximate, so I think I'm going in the wrong direction as far as the heuristic function is concerned

A* guarantees to find the lowest cost path in a graph with nonnegative edge path costs, provided that you use an appropriate heuristic. What makes a heuristic function appropriate?
First, it must be admissible, i. e. it should, for any node, produce either an underestimate or a correct estimate for the cost of the cheapest path from that node to any of goal nodes. This means the heuristic should never overestimate the cost to get from the node to the goal.
Note that if your heuristic computes the estimate cost of 0 for every node, then A* just turns into breadth-first exhaustive search. So h(n)=0 is still an admissible heuristic, only the worst possible one. So, of all admissible heuristics, the tighter one estimates the cost to the goal, the better it is.
Second, it must be cheap to compute. It should be certainly O(1), and should preferably look at the current node alone. Recursively evaluating the cost as you propose will make your search significantly slower, not faster!
The question of A* applicability is thus whether you can come up with a reasonably good heuristic. From your problem description, it is not clear whether you can easily come up with one.
Depending on the problem domain, A* may be very useful if requirements are relaxed. If heuristic becomes inadmissible, then you lose the guarantee of finding the best path. Depending on the degree of overestimation of the distance, hovewer, the solution might still be good enough (for problem specific definition of "good enough"). The advantage is that sometimes you can compute that "good enough" path much faster. In some cases, probabilistic estimate of heuristics works good (it can have additional constraints on it to stay in the admissible range).
So, in general, you have breadth-first search for tractable problems, next faster you have A* for tractable problems with admissible heuristic. If your problem is intractable for breadth-first exhaustive search and does not admit a heuristic, then your only option is to settle for a "good enough" suboptimal solution. Again, A* may still work with inadmissible heuristic here, or you should look at beam search varieties. The difference is that beam searches have a limit on the number of ways the graph is currently being explored, while A* limits them indirectly by choosing some subset of less costly ones. There are practical cases not solvable by A* even with relaxed admissbility, when difference in cost among different search path is slight. Beam search with its hard limit on the number of paths works more efficiently in such problems.

Seems like an overkill to use A* when something like Dijkstra's would work. Dijkstra's will only work on non-negative cost transitions which seems true in this example. If you can have negative cost transitions then Bellman-Ford should also work.

Related

will a star always return the least cost path?

I have been implementing a star recently on one of my pathfinding visualisers. A common thing that i have noticed is that while it does return the shortest path, it sometimes fails to return the least cost path. Now i am unsure as to if, this is due to some implementation error or if this is not a characteristic of the algorithm on a whole. For reference, these are the outputs of the a star and dijkstras algo respectively:
So, why is this the case? (PS:the weights are 10, normal cost is 1 for any direction of movement and, the grey patches are walls)
A* is optimal. It will always return the least-cost path. But the heuristic value must be admissible.
A* may not return the least cost path, that is correct. This is a feature of A* though, not a disadvantage. To answer your question,
Why is it the case?
A* has a heuristics function that allows the algorithm to sacrifice accuracy for much higher performance by significantly reducing the area of search. This can be seen in your image too, where the Dijkstra's area of search is much wider. A* is often more useful in real-time games, where you don't necessarily need the lowest-cost path, as long as it is "good enough". Refer to this article for more examples. I find it explains A* very clearly in simple language.
As to what is "good enough", you define this using the heuristics function. Using different function will result in different behavior, allowing you to tune the algorithm to be optimal for your needs (depending on your playing field and accuracy requirement). You can also remove the heuristics, in which case it becomes simply the Dijkstra's algorithm. Refer to this article from the same site for some examples of heuristics.
A* gives a correct result provided the heuristic gives a lower bound to the true cost. Every possible path must have cost higher or equal to the value provided by the heuristic.
The algorithm becomes identical to Dijkstra's if you use a trivial heuristic that always gives the value 0.
To add to the correct answers by Md Asaduzzaman and Henry:
If the heuristic function is admissible, then A* tree search will
always find the least cost path.
(source)
An admissible heuristic is optimistic is an optimistic one: the heuristic cost should be less than or equal to the actual cost.
In other words, if your implementation does not return the lowest cost path, consider changing the heuristic to an admissible one.

Is A* really better than Dijkstra in real-world path finding?

I'm developing a path finding program. It is said theoretically that A* is better than Dijkstra. In fact, the latter is a special case of the former. However, when testing in the real world, I begin to doubt that is A* really better?
I used data of New York City, from 9th DIMACS Implementation Challenge - Shortest Paths, in which each node's latitude and longitude is given.
When applying A*, I need to calculate the spherical distance between two points, using Haversine Formula, which involves sin, cos, arcsin, square root. All of those are very very time-consuming.
The result is,
Using Dijkstra: 39.953 ms, expanded 256540 nodes.
Using A*, 108.475 ms, expanded 255135 nodes.
Noticing that in A*, we expanded less 1405 nodes. However, the time to compute a heuristic is much more than that saved.
To my understanding, the reason is that in a very large real graph, the weight of the heuristic will be very small, and the effect of it can be ignored, while the computing time is dominating.
I think you're more or less missing the point of A*. It is intended to be a very performant algorithm, partially by intentionally doing more work but with cheap heuristics, and you're kind of tearing that to bits when burdening it with a heavy extremely accurate prediction heuristic.
For node selection in A* you should just use an approximation of distance. Simply using (latdiff^2)+(lngdiff^2) as the approximated distance should make your algorithm much more performant than Dijkstra, and not give much worse results in any real world scenario. Actually the results should even be exactly the same if you do calculate the travelled distance on a selected node properly with the Haversine. Just use a cheap algorithm for selecting potential next traversals.
A* can be reduced to Dijkstra by setting some trivial parameters. There are three possible ways in which it does not improve on Dijkstra:
The heuristic used is incorrect: it is not a so-called admissible heuristic. A* should use a heuristic which does not overestimate the distance to the goal as part of its cost function.
The heuristic is too expensive to calculate.
The real-world graph structure is special.
In the case of the latter you should try to build on existing research, e.g. "Highway Dimension, Shortest Paths, and Provably Efficient Algorithms" by Abraham et al.
Like everything else in the universe, there's a trade-off. You can take dijkstra's algorithm to precisely calculate the heuristic, but that would defeat the purpose wouldn't it?
A* is a great algorithm in that it makes you lean towards the goal having a general idea of which direction to expand first. That said, you should keep the heuristic as simple as possible because all you need is a general direction.
In fact, more precise geometric calculations that are not based on the actual data do not necessarily give you a better direction. As long as they are not based on the data, all those heuristics give you just a direction which are all equally (in)correct.
In general A* is more performant than Dijkstra's but it really depends the heuristic function you use in A*. You'll want an h(n) that's optimistic and finds the lowest cost path, h(n) should be less than the true cost. If h(n) >= cost, then you'll end up in a situation like the one you've described.
Could you post your heuristic function?
Also bear in mind that the performance of these algorithms highly depends on the nature of the graph, which is closely related to the accuracy of the heuristic.
If you compare Dijkstra vs A* when navigating out of a labyrinth where each passage corresponds to a single graph edge, there is probably very little difference in performance. On the other hand, if the graph has many edges between "far-away" nodes, the difference can be quite dramatic. Think of a robot (or an AI-controlled computer game character) navigating in almost open terrain with a few obstacles.
My point is that, even though the New York dataset you used is definitely a good example of "real-world" graph, it is not representative of all real-world path finding problems.

Difference and advantages between dijkstra & A star [duplicate]

This question already has answers here:
How does Dijkstra's Algorithm and A-Star compare?
(12 answers)
Closed 4 years ago.
I read this:
http://en.wikipedia.org/wiki/A*_search_algorithm
It says A* is faster than using dijkstra and uses best-first-search to speed things up.
If I need the algorithm to run in milliseconds, when does A* become the most prominent choice.
From what I understand it does not necessarily return the best results.
If I need quick results, is it better to pre-compute the paths? It may take megabytes of space to store them.
It says A* is faster than using dijkstra and uses best-first-search to
speed things up.
A* is basically an informed variation of Dijkstra.
A* is considered a "best first search" because it greedily chooses which vertex to explore next, according to the value of f(v) [f(v) = h(v) + g(v)] - where h is the heuristic and g is the cost so far.
Note that if you use a non informative heuristic function: h(v) = 0 for each v: you get that A* chooses which vertex to develop next according to the "so far cost" (g(v)) only, same as Dijkstra's algorithm - so if h(v) = 0, A* defaults to Dijkstra's Algorithm.
If I need the algorithm to run in milliseconds, when does A* become
the most prominent choice.
Not quite, it depends on a lot of things. If you have a decent heuristic function - from my personal experience, greedy best first (choosing according to the heuristic function alone) - is usually significantly faster than A* (but is not even near optimal).
From what I understand it does not necessarily return the best
results.
A* is both complete (finds a path if one exists) and optimal (always finds the shortest path) if you use an Admissible heuristic function. If your function is not admissible - all bets are off.
If I need quick results, is it better to pre-compute the paths? It may
take megabytes of space to store them.
This is a common optimization done on some problems, for example on the 15-puzzle problem, but it is more advanced. A path from point A to point B is called a Macro. Some paths are very useful and should be remembered. A Machine Learning component is added to the algorithm in order to speed things up by remembering these Macros.
Note that the path from point A to point B in here is usually not on the states graph - but in the problem itself (for example, how to move a square from the lowest line to the upper line...)
To speed things up:
If you have a heuristic and you find it too slow, and you want a quicker solution, even if not optimal - A* Epsilon is usually faster then A*, while giving you a bound on the optimality of the path (how close it is to being optimal).
Dijkstra is a special case for A* (when the heuristics is zero).
A* search:
It has two cost function.
g(n): same as Dijkstra. The real cost to reach a node n.
h(n): approximate cost from node n to goal node. It is a heuristic function. This heuristic function should never overestimate the cost. That means, the real cost to reach goal node from node n should be greater than or equal h(n). It is called admissible heuristic.
The total cost of each node is calculated by f(n)=g(n)+h(n)
Dijkstra's:
It has one cost function, which is real cost value from source to each node: f(n)=g(n)
It finds the shortest path from source to every other node by considering only real cost.
A* is just like Dijkstra, the only difference is that A* tries to look for a better path by using a heuristic function which gives priority to nodes that are supposed to be better than others while Dijkstra's just explore all possible paths.
Its optimality depends on the heuristic function used, so yes it can return a non optimal result because of this and at the same time better the heuristic for your specific layout, and better will be the results (and possibly the speed).
It is meant to be faster than Dijkstra even if it requires more memory and more operations per node since it explores a lot less nodes and the gain is good in any case.
Precomputing the paths could be the only way if you need realtime results and the graph is quite large, but usually you wish to pathfind the route less frequently (I'm assuming you want to calculate it often).
These algorithems can be used in pathfinding and graph traversal, the process of plotting an efficiently directed path between multiple points, called nodes.
Formula for a* is f =g + h., g means actual cost and h means heuristic cost.
formula for Dijktras is f = g. there is no heuristic cost. when we are using a* and if heuristic cost is 0 then it'll equal to Dijktras algorithem.
Short answer:
A* uses heuristics to optimize the search. That is, you are able to define a function that to some degree can estimate the cost from one node to the target. This is particulary useful when you are searching for a path on a geographical representation (map) where you can, for instance, guess the distance to the target from a given graph node. Hence, typically A* is used for path finding in games etc. Where Djikstra is used in more generic cases.
No, A* won't always give the best path.
If heuristic is the "geographical" distance, the following example might give the non optimal path.
[airport] - [road] - [start] -> [road] -> [road] -> [road] -> [road] -> [target] - [airport]
|----------------------------------------------------------------|

Heuristic and A* algorithm

I was reading about dijkstra algorithm and A* star algorithm. I know that the difference is the heuristic used. But what is a heuristic and how this influence the algorithms? Heuristic is just an way to measure the distance? But dijkstra considers the distance too? Sorry, but my question is about heuristic and what it means and why to use them... (I had read about it, but don't understant)
Other question: When should be used each one?
Thank you
In this context, a heuristic is a way of providing the algorithm with some form of extra evaluative information, so that the algorithm can find a 'good enough' solution, without exhaustively searching every possible solution.
Dijkstra's Algorithm does not use a heuristic. It expands outwards from the start node, and examines every node in the graph in order to find the shortest path. While this is accurate, it can be computationally expensive.
By comparison, the A* algorithm uses a distance + cost heuristic, to guide the algorithm in its choice of the next node to explore. This means that the algorithm finds a possible search solution without examining every node on the graph. It is therefore much cheaper to run, but at a loss of complete accuracy. It works because the result is usually close enough to the optimal solution, and is found cheaper than an exhaustive search of the entire graph.
As to when you should use each, it really depends on the application. However, using the A* algorithm requires an admissible heuristic, so this may not be applicable in situations where such information is unavailable to the algorithm.
A heuristic basically means an idea or an intuition! Any strategy that you use to solve a hard problem is a heuristic! In some fields (like combinatorial problems) it refers to the strategy that can help you solve a NP hard problem suboptimally within a polynomial time.
Dijkstra solves the single-source routing problem, i.e. it gives you the cost of going froma single point to any other point in the space.
A* solves a single-source single-target problem. It gives you a minimum distance path from a given point to another given point.
A* is usually faster than Dijkstra provided you give it an admissible heuristic, this is, an estimate of the distance to the goal, that never overestimates said distance. Contrary to a previous answer given here, if the heuristic used is admissible, A* is complete and will give you the optimal answer.

Astar-like algorithm with unknown endstate

A-star is used to find the shortest path between a startnode and an endnode in a graph. What algorithm is used to solve something were the target state isn't specifically known and we instead only have a criteria for the target state?
For example, can a sudoku puzzle be solved with an Astar-like algorithm? We dont know how the endstate will look like (which number is where) but we do know the rules of sudoku, a criteria for a winning state. Therefore I have a startnode and just a criteria for the endnode, which algorithm to use?
A* requires a graph, a cost function for traversal of that graph, a heuristic as to whether a node in the graph is closer to the goal than another, and a test whether the goal is reached.
Searching a Sudoku solution space doesn't really have a traversal cost to minimize, only a global cost ( the number of unsolved squares ), so all traversals would be equal cost, so A* doesn't really help - any cell you could assign costs one move and moves you one closer to the goal, so A* would be no better than choosing the next step at random.
It might be possible to apply an A* search based on the estimated/measured cost of applying the different techniques at each point, which would then try to find a path through the solution space with the least computational cost. In that case the graph would not just be the solution states of the puzzle, but you'd be choosing between the techniques to apply at that point - you'd know an estimate of the cost of a transition, but not where that transition 'goes', except that if successful, it's one step closer to the goal.
Yes, A* can be used when a specific goal state cannot be identified. (Pete Kirkham's answer implies this, but doesn't emphasise it much.)
When a specific goal state can't be identified, it's sometimes harder to come up with a useful heuristic lower bound on the remaining cost needed to complete a partial solution -- and the efficiency of A* depends on choosing an effective heuristic. But it doesn't mean it can't be applied. Any problem that can be solved on a computer can be solved using a breadth-first search, plus an array of flags indicating whether a state has been seen before; which is the same as A* with a heuristic lower bound that is always zero. (Of course, this is not the most efficient algorithm for solving many problems.)
You dont have to know the exact target endstate. It all comes down to the heuristic function, when it returns 0 you could assume to have found (at least) one of the valid endstates.
So during the a*, instead of checking if current_node == target_node, check if current_node.h() returns 0. If so, it should be infinitely close and/or overlapping the goal/endstate.

Resources