Should an ant colony algorithm find the best path in 100% of cases?

I implemented an ant colony algorithm, and it works quite well at the moment.
In some borderline cases it does not return the best path, but one close to the best.
For example, I have this graph:
Matrix is:
    1 2 3 4 5 6 7
1   0 6 5 0 0 2 0
2   6 0 3 2 1 5 0
3   5 3 0 2 5 0 0
4   0 2 2 0 3 0 0
5   0 1 5 3 0 6 0
6   2 5 0 0 6 0 2
7   0 0 0 0 0 2 0
The first column and the first row are vertex names.
So the possible paths from vertex 1 to vertex 5 are (path with its length):
1. 1-2-5 with length 7
2. 1-6-2-5 with length 8
3. 1-6-5 with length 8
My program chooses the 1st path in 1 out of 10 runs, the 2nd path in 7 out of 10 runs and the 3rd path in 2 out of 10 runs.
Is it working correctly?
My explanation is that the ants have their own eyes (vision: they look at edge length) and can also detect the pheromone level. Their eyes show them that edge 1-2 is rather long, longer than edge 1-6, so in general they choose edge 1-6 instead of 1-2. The same holds for 6-5 and 6-2: 6-2 is more attractive because it is shorter.
Is my assumption right?

According to this: http://en.wikipedia.org/wiki/Ant_colony_optimization_algorithms#Summary , I can see 2 problems in your approach:
1. Ants (initially) wander randomly; that has nothing to do with vision or the length of the adjacent edge.
2. Do you model the pheromone trails at all?
Answering the question "Should ant colony algorithm show best path in 100% cases?": no, it is not guaranteed to show the best path at all.

In ant colony optimization algorithms, the ants have a probability for each possible step while walking through the graph. Typically, this probability is based on two factors: a local and a global measure of quality.
The global measure is usually associated with the pheromone deposited on an edge: pheromone is added to each edge of the path followed by an ant, and the amount added is related to the quality of the solution that ant built.
The local measure is usually related to the quality of a particular step: the cost of an edge, in the example provided.
Therefore, if your ants are taking only greedy actions, it is possible that the probability function you are using gives too much weight to local quality. Finding a probability function that exhibits a good compromise between local and global search is a fundamental aspect of a successfully applied ACO strategy.
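To make the local/global trade-off concrete, here is a minimal sketch of one commonly used transition rule, in which the weight of an edge is pheromone^alpha * (1/length)^beta. The names transition_probabilities, alpha and beta are assumptions for illustration and do not come from the question's code; the edge lengths are those leaving vertex 1 in the question's matrix.

```python
def transition_probabilities(pheromone, length, alpha=1.0, beta=2.0):
    """Probability of stepping to each neighbour of the current vertex.

    pheromone: {neighbour: pheromone level on the edge}  (global measure)
    length:    {neighbour: edge length}                  (local measure)
    alpha, beta: assumed exponents weighting the two measures.
    """
    weight = {
        v: (pheromone[v] ** alpha) * ((1.0 / length[v]) ** beta)
        for v in length
    }
    total = sum(weight.values())
    return {v: w / total for v, w in weight.items()}


# Edges leaving vertex 1 in the question's graph: 1-2 (6), 1-3 (5), 1-6 (2).
length_from_1 = {2: 6, 3: 5, 6: 2}
uniform = {v: 1.0 for v in length_from_1}  # no pheromone information yet

greedy_like = transition_probabilities(uniform, length_from_1, beta=2.0)
length_blind = transition_probabilities(uniform, length_from_1, beta=0.0)
print(greedy_like)
print(length_blind)
```

With uniform pheromone and beta = 2, roughly 79% of the ants leave vertex 1 through edge 1-6, which matches the bias described in the question. With beta = 0 the edge length is ignored and the three edges are equally likely, so lowering beta (or raising alpha) is one way to shift weight from the local to the global measure.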

Why are you using ant colony for shortest path? If you are searching for a shortest path you don't need an optimization heuristic: the best solution can be found in polynomial time with Dijkstra's algorithm, or with A* and an admissible heuristic function. Ant colony is a better fit for problems such as the TSP.
And the answer is: no. Keep in mind that the algorithm is probabilistic, so it may converge to a local minimum instead of the best solution.
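For comparison, an exact shortest-path search on the question's graph is short to write. The sketch below uses Dijkstra's algorithm and assumes that a 0 in the matrix means "no edge"; the function name shortest_path is hypothetical.

```python
import heapq

# Adjacency matrix from the question; 0 is assumed to mean "no edge".
MATRIX = [
    [0, 6, 5, 0, 0, 2, 0],
    [6, 0, 3, 2, 1, 5, 0],
    [5, 3, 0, 2, 5, 0, 0],
    [0, 2, 2, 0, 3, 0, 0],
    [0, 1, 5, 3, 0, 6, 0],
    [2, 5, 0, 0, 6, 0, 2],
    [0, 0, 0, 0, 0, 2, 0],
]


def shortest_path(matrix, source, target):
    """Dijkstra's algorithm; vertices are numbered from 1 as in the question."""
    n = len(matrix)
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v in range(1, n + 1):
            w = matrix[u - 1][v - 1]
            if w and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    if target not in dist:
        return None
    # Walk the predecessor links back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


print(shortest_path(MATRIX, 1, 5))  # (7, [1, 2, 5])
```

This always returns the path 1-2-5 of length 7, the one the ant colony run found in only 1 of 10 starts.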

Related

Issue with min cost path

I was solving the min cost path problem with dynamic programming, but then I realised that a greedy approach also seems to work.
I applied greedy like this:
choose the minimum of the bottom, right and diagonal costs and move to that cell.
1 2 3
4 8 2
1 5 3
where each number is the cost added to the total if the path includes that cell.
The path from 1 to 3 costs 12; through greedy it costs 8.
If my approach doesn't work for all inputs, what is a counterexample?
How about a map such as:
1 1 1
2 10 10
1 1 1
Your greedy approach will end up taking 1+1+1+10+1 = 14, instead of 1+2+1+1 = 5.
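The difference is easy to check in code. This is a minimal sketch, assuming the path runs from the top-left cell to the bottom-right cell with moves down, right or diagonally down-right, as in the question; greedy_cost and dp_cost are hypothetical names.

```python
def greedy_cost(grid):
    """Always step to the cheapest of the down, right and diagonal cells."""
    rows, cols = len(grid), len(grid[0])
    r = c = 0
    total = grid[0][0]
    while (r, c) != (rows - 1, cols - 1):
        moves = [
            (r + dr, c + dc)
            for dr, dc in ((1, 0), (0, 1), (1, 1))
            if r + dr < rows and c + dc < cols
        ]
        r, c = min(moves, key=lambda p: grid[p[0]][p[1]])
        total += grid[r][c]
    return total


def dp_cost(grid):
    """Dynamic programming: best[r][c] is the cheapest cost to reach (r, c)."""
    rows, cols = len(grid), len(grid[0])
    best = [[float("inf")] * cols for _ in range(rows)]
    best[0][0] = grid[0][0]
    for r in range(rows):
        for c in range(cols):
            if r == 0 and c == 0:
                continue
            prev = []
            if r > 0:
                prev.append(best[r - 1][c])
            if c > 0:
                prev.append(best[r][c - 1])
            if r > 0 and c > 0:
                prev.append(best[r - 1][c - 1])
            best[r][c] = grid[r][c] + min(prev)
    return best[-1][-1]


question_grid = [[1, 2, 3], [4, 8, 2], [1, 5, 3]]
counterexample = [[1, 1, 1], [2, 10, 10], [1, 1, 1]]
print(greedy_cost(question_grid), dp_cost(question_grid))    # 8 8
print(greedy_cost(counterexample), dp_cost(counterexample))  # 14 5
```

Under these assumptions both methods happen to agree on the grid from the question, while the counterexample separates them.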
Greedy algorithms can be 'beaten' by giving them a long meandering path with several small steps:
2 2 e
2 ∞ 0
s 3 0
In this case, going from s to e gives one of two results:
The greedy solution: see that 2 is smaller than 3 and slog through three 2's for a total cost of 6.
A dynamic solution: see that after the 3 there is an easy path, for a total cost of 3.
Also, I'd take a look at how your two algorithms define length. The optimal path is actually 1 -> 4 -> 2 -> 3 which has a cost of 10. If your dynamic solution isn't returning that, it may indicate that something else is going on.

Minimum number of steps to sort 3x3 matrix in a specific way

So I started practicing some algorithms and programming before university starts and I ran into this problem:
Given a 3x3 matrix containing the numbers from 0 to 8, find the minimum number of steps required to sort the matrix in the following format:
1 2 3
4 5 6
7 8 0
In one move it is only allowed to pick a cell that is adjacent to the cell which contains the 0 and swap those two cells.
Now, I am really stuck with this one and have no idea how to begin. Any tips and ideas to get me started are appreciated.
This is not homework if anyone thinks that way, I am just trying to exercise and by moving to tougher problems I got stuck. I am not looking for anyone to write the code for me, I just need a point in the right direction because I really want to understand the algorithm behind this. Thank you.
Note: This is actually an AI problem, and not a trivial data structure/algorithm problem.
This problem is called the n-puzzle problem. The example in your question is the 8-puzzle problem.
The way to solve this problem is by trying to shuffle the boxes in a way that each step gets you closer to your final goal. Think of this as a Greedy approach (Best-first search). The best algorithm to use here is the A* algorithm.
We define a state of the game to be the board position, the number of
moves made to reach the board position, and the previous state. First,
insert the initial state (the initial board, 0 moves, and a null
previous state) into a priority queue. Then, delete from the priority
queue the state with the minimum priority, and insert onto the
priority queue all neighboring states (those that can be reached in
one move). Repeat this procedure until the state dequeued is the goal
state. The success of this approach hinges on the choice of priority
function for a state. We consider two priority functions:
Hamming priority function. The number of blocks in the wrong position, plus the number of moves made so far to get to the state. Intuitively, a state with a small number of blocks in the wrong position is close to the goal state, and we prefer a state that has been reached using a small number of moves.
Manhattan priority function. The sum of the distances (sum of the vertical and horizontal distance) from the blocks to their goal positions, plus the number of moves made so far to get to the state.
For example, the Hamming and Manhattan priorities of the initial state
below are 5 and 10, respectively.
 8  1  3        1  2  3     1  2  3  4  5  6  7  8    1  2  3  4  5  6  7  8
 4     2        4  5  6     ----------------------    ----------------------
 7  6  5        7  8        1  1  0  0  1  1  0  1    1  2  0  0  2  2  0  3

 initial          goal         Hamming = 5 + 0          Manhattan = 10 + 0
We make a key observation: to solve the puzzle from a given state
on the priority queue, the total number of moves we need to make
(including those already made) is at least its priority, using either
the Hamming or Manhattan priority function. (For Hamming priority,
this is true because each block that is out of place must move at
least once to reach its goal position. For Manhattan priority, this is
true because each block must move its Manhattan distance from its goal
position. Note that we do not count the blank tile when computing the
Hamming or Manhattan priorities.)
Consequently, as soon as we dequeue a state, we have not only
discovered a sequence of moves from the initial board to the board
associated with the state, but one that makes the fewest number of
moves.
(Source)
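The procedure quoted above can be sketched as follows. This is an illustrative implementation rather than the code from the quoted source: boards are assumed to be 9-tuples in row-major order with 0 for the blank, and the names hamming, manhattan, neighbors and solve are hypothetical.

```python
import heapq

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the blank


def hamming(board):
    """Number of tiles (blank excluded) that are not on their goal cell."""
    return sum(1 for i, t in enumerate(board) if t != 0 and t != GOAL[i])


def manhattan(board):
    """Sum of row and column distances of every tile to its goal cell."""
    total = 0
    for i, t in enumerate(board):
        if t != 0:
            goal = t - 1
            total += abs(i // 3 - goal // 3) + abs(i % 3 - goal % 3)
    return total


def neighbors(board):
    """Yield every board reachable by sliding one tile into the blank."""
    z = board.index(0)
    r, c = divmod(z, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < 3 and 0 <= nc < 3:
            j = nr * 3 + nc
            nxt = list(board)
            nxt[z], nxt[j] = nxt[j], nxt[z]
            yield tuple(nxt)


def solve(start, h=manhattan):
    """Return the minimum number of moves to GOAL, or None if unreachable."""
    best = {start: 0}
    heap = [(h(start), 0, start)]  # (priority, moves so far, board)
    while heap:
        _, g, board = heapq.heappop(heap)
        if board == GOAL:
            return g
        if g > best[board]:
            continue  # a shorter route to this board was found later
        for nxt in neighbors(board):
            if g + 1 < best.get(nxt, float("inf")):
                best[nxt] = g + 1
                heapq.heappush(heap, (g + 1 + h(nxt), g + 1, nxt))
    return None


initial = (8, 1, 3, 4, 0, 2, 7, 6, 5)
print(hamming(initial), manhattan(initial))  # 5 10
print(solve((0, 1, 3, 4, 2, 5, 7, 8, 6)))    # 4
```

Instead of storing the previous state, this sketch keeps a dictionary of the best known move count per board, which is enough when only the number of steps is required. Storing a predecessor per board would recover the move sequence as well.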

What is a good individual representation for a closed path planning task using genetic algorithm?

There is an n*n grid, and an agent A lies in one of its cells.
A can travel T cells.
Each cell in the grid has some weight and the path for A has to maximize that weight.
A also has to return to its starting position within its traveling range T.
What can be a good individual representation to represent the paths?
Methods I have tried:
Chromosome is a list of coordinates.
Chromosome is a list of directions. Each gene is a direction like up, down, up-right, etc. Path never breaks in the middle.
The problem with both methods is that crossover almost always generates invalid paths: the paths become broken in the middle and no longer form a closed path. I can't figure out a good way to represent an individual solution together with an appropriate crossover method. Please help.
First of all, I would say that this problem is a better fit for other approaches, such as maybe ant colony optimization, greedy approaches that give good enough solutions etc. GAs might not work so well for the exact reason you describe.
However, if you must use GAs, here are two possible models that might be worth investigating:
Severely punish invalid paths by giving invalid moves a cost of -infinity. For example, if your chromosome says go from a cell x to an unreachable cell y, consider the cost of y -infinity. This might be worth combining with a low probability of crossover happening, something like 5% maybe.
Don't do crossover, just do some form of more involved mutation of the offspring.
If you want to get even fancier, this is somewhat similar to the travelling salesman problem, which has a lot of research in relation to genetic algorithms:
http://www.lalena.com/AI/Tsp/
http://www.math.hmc.edu/seniorthesis/archives/2001/kbryant/kbryant-2001-thesis.pdf
You could encode the path as a reference list:
Assume these are your locations (1 2 3 4 5 6 7 8 9)
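A route through the subset (1 2 4 3 8) could be encoded as (1 1 2 1 4): each gene is the 1-based position of the next location in the list of locations not yet visited.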
Now take two parents
p1 = (1 1 2 1 | 4 1 3 1 1)
p2 = (5 1 5 5 | 5 3 3 2 1)
which will produce
o1 = (1 1 2 1 5 3 3 2 1)
o2 = (5 1 5 5 4 1 3 1 1)
which will be decoded into these location routes
o1 = 1 – 2 – 4 – 3 – 9 – 7 – 8 – 6 – 5
o2 = 5 – 1 – 7 – 8 – 6 – 2 – 9 – 3 – 4
This way, a crossover will always yield valid results (whether this representation will help you solving your problem better is a different question).
Some additional information can be found here.
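A minimal sketch of the reference-list decoding and the one-point crossover from the example above; the function names decode and one_point_crossover are assumptions for illustration.

```python
def decode(chromosome, locations):
    """Turn a reference-list chromosome into a route.

    Each gene is the 1-based position of the next location within the
    list of locations that have not been visited yet.
    """
    remaining = list(locations)
    return [remaining.pop(gene - 1) for gene in chromosome]


def one_point_crossover(p1, p2, cut):
    """Swap the tails of two parents after position `cut`."""
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]


locations = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p1 = [1, 1, 2, 1, 4, 1, 3, 1, 1]
p2 = [5, 1, 5, 5, 5, 3, 3, 2, 1]

o1, o2 = one_point_crossover(p1, p2, cut=4)
print(decode(o1, locations))  # [1, 2, 4, 3, 9, 7, 8, 6, 5]
print(decode(o2, locations))  # [5, 1, 7, 8, 6, 2, 9, 3, 4]
```

The offspring are always decodable because the gene at position i only has to be at most the number of locations still unvisited at that point, and that bound depends on the position alone, not on the earlier genes. Note that this guarantees a valid visiting order; whether consecutive cells are adjacent on the grid and whether the path closes within T steps still has to be handled separately.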

Finding good heuristic for A* search

I'm trying to find the optimal solution for a little puzzle game called Twiddle (an applet with the game can be found here). The game has a 3x3 matrix with the numbers from 1 to 9. The goal is to bring the numbers into the correct order using the minimum number of moves. In each move you can rotate a 2x2 square either clockwise or counterclockwise.
I.e. if you have this state
6 3 9
8 7 5
1 2 4
and you rotate the upper left 2x2 square clockwise you get
8 6 9
7 3 5
1 2 4
I'm using an A* search to find the optimal solution. My f() is simply the number of rotations needed. My heuristic function already leads to the optimal solution (if I modify it, see the notice at the end) but I don't think it's the best one you can find. My current heuristic takes each corner, looks at the number at the corner, calculates the Manhattan distance to the position this number will have in the solved state (which gives me the number of rotations needed to bring the number to this position) and sums all these values. I.e. you take the above example:
6 3 9
8 7 5
1 2 4
and this end state
1 2 3
4 5 6
7 8 9
then the heuristic does the following
6 is currently at index 0 and should be at index 5: 3 rotations needed
9 is currently at index 2 and should be at index 8: 2 rotations needed
1 is currently at index 6 and should be at index 0: 2 rotations needed
4 is currently at index 8 and should be at index 3: 3 rotations needed
h = 3 + 2 + 2 + 3 = 10
Additionally, if h is 0 but the state is not completely ordered, then h = 1.
But there is the problem that you rotate 4 elements at once. So there are rare cases where you can do two (or more) of these estimated rotations in one move. This means this heuristic overestimates the distance to the solution.
My current workaround is to simply exclude one of the corners from the calculation, which solves this problem at least for my test cases. I've done no research into whether it really solves the problem or whether this heuristic still overestimates in some edge cases.
So my question is: What is the best heuristic you can come up with?
(Disclaimer: This is for a university project, so this is a bit of homework. But I'm free to use any resource I can come up with, so it's okay to ask you guys. Also I will credit Stackoverflow for helping me ;) )
Simplicity is often most effective. Consider the nine digits (in the rows-first order) as forming a single integer. The solution is represented by the smallest possible integer i(g) = 123456789. Hence I suggest the following heuristic h(s) = i(s) - i(g). For your example, h(s) = 639875124 - 123456789.
You can get an admissible (i.e., not overestimating) heuristic from your approach by taking all numbers into account, and dividing by 4 and rounding up to the next integer.
To improve the heuristic, you could look at pairs of numbers. If e.g. in the top left the numbers 1 and 2 are swapped, you need at least 3 rotations to fix them both up, which is a better value than 1+1 from considering them separately. In the end, you still need to divide by 4. You can pair up numbers arbitrarily, or even try all pairs and find the best division into pairs.
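The basic variant of this idea (all tiles, divided by 4, rounded up) can be sketched as below. Boards are assumed to be 9-tuples in row-major order, and the names rotate_clockwise and heuristic are hypothetical. The bound is admissible because one rotation moves four tiles by one cell each, so it lowers the total Manhattan distance by at most 4.

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 9)


def rotate_clockwise(board, top, left):
    """Rotate the 2x2 block whose upper-left cell is (top, left) clockwise."""
    ul = top * 3 + left  # upper-left index
    ur = ul + 1          # upper-right
    ll = ul + 3          # lower-left
    lr = ul + 4          # lower-right
    b = list(board)
    b[ul], b[ur], b[lr], b[ll] = board[ll], board[ul], board[ur], board[lr]
    return tuple(b)


def total_manhattan(board):
    """Sum of Manhattan distances of all nine numbers to their goal cells."""
    total = 0
    for i, value in enumerate(board):
        goal = value - 1
        total += abs(i // 3 - goal // 3) + abs(i % 3 - goal % 3)
    return total


def heuristic(board):
    """Lower bound on the rotations needed: ceil(total Manhattan / 4)."""
    return (total_manhattan(board) + 3) // 4


state = (6, 3, 9, 8, 7, 5, 1, 2, 4)
print(rotate_clockwise(state, 0, 0))            # (8, 6, 9, 7, 3, 5, 1, 2, 4)
print(total_manhattan(state), heuristic(state))  # 18 5
```

For the example state the estimate is 5, compared with 10 from the corner-only sum in the question, which can overestimate.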
All elements should be taken into account when calculating the distance, not just the corner elements. Imagine that all corner elements 1, 3, 7, 9 are at their home positions, but all the others are not.
It could be argued that those elements that are neighbors in the final state should tend to become closer during each step, so neighboring distance can also be part of heuristic, but probably with weaker influence than distance of elements to their final state.

Adding waypoints to A* graph search

I have the ability to calculate the best route between a start and end point using A*. Right now, I am including waypoints between my start and end points by applying A* to the pairs in all permutations of my points.
Example:
I want to get from point 1 to point 4. Additionally, I want to pass through points 2 and 3.
I calculate the permutations of (1, 2, 3, 4):
1 2 3 4
1 2 4 3
1 3 2 4
1 3 4 2
1 4 2 3
1 4 3 2
2 1 3 4
2 1 4 3
2 3 1 4
2 3 4 1
2 4 1 3
2 4 3 1
3 1 2 4
3 1 4 2
3 2 1 4
3 2 4 1
3 4 1 2
3 4 2 1
4 1 2 3
4 1 3 2
4 2 1 3
4 2 3 1
4 3 1 2
4 3 2 1
Then, for each permutation, I calculate the A* route from the first to the second, then append it to the route from the second to the third, then the third to the fourth.
When I have this calculated for each permutation, I sort the routes by distance and return the shortest.
Obviously, this works but involves a lot of calculation and totally collapses when I have 6 waypoints (permutations of 8 items is 40320 :-))
Is there a better way to do this?
First of all, you should store all intermediate calculations. Once you calculated the route from 1 to 2, you should never recalculate it again, just look up in a table.
Second, if your graph is undirected, a route from 2 to 1 has exactly the same distance as a route from 1 to 2, so you should not recalculate it either.
And finally, in any case you will have an algorithm that is exponential in the number of points you need to pass through. This is very similar to the traveling salesman problem, and it becomes exactly that problem if you include all available points. The problem is NP-hard, i.e. no polynomial-time algorithm is known, and every known exact algorithm takes time exponential in the number of waypoints.
So if you have a lot of points that you must pass, exponential collapse is inevitable.
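The first two suggestions can be sketched as follows. The function pair_cost stands in for an A* call between two points and is a hypothetical placeholder, as is the name best_route. The sketch also assumes that the start and end points are fixed, so only the intermediate waypoints are permuted, and that point ids are comparable so each undirected pair gets one cache entry.

```python
from itertools import permutations


def best_route(start, end, waypoints, pair_cost):
    """Try every order of the waypoints, computing each pair cost only once.

    pair_cost(a, b) is assumed to be symmetric (undirected graph).
    """
    cache = {}

    def cost(a, b):
        key = (a, b) if a <= b else (b, a)  # one entry per undirected pair
        if key not in cache:
            cache[key] = pair_cost(*key)
        return cache[key]

    best = None
    for order in permutations(waypoints):
        route = (start, *order, end)
        total = sum(cost(a, b) for a, b in zip(route, route[1:]))
        if best is None or total < best[0]:
            best = (total, route)
    return best


# Toy example: four points on a line, cost = distance along the line.
position = {1: 0, 2: 2, 3: 1, 4: 3}
result = best_route(1, 4, [2, 3], lambda a, b: abs(position[a] - position[b]))
print(result)  # (3, (1, 3, 2, 4))
```

With the endpoints fixed, the example from the question needs 2 orderings instead of 24, and 6 waypoints need 720 instead of 40320. The growth is still factorial, so this only postpones the collapse.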
As a previous answer mentioned, this problem is essentially the NP-hard Traveling Salesperson Problem.
There is a better method than the one you use. The state-of-the-art TSP solver is Georgia Tech's Concorde. If you can't simply call their freely available program from your own or use their API, I can describe the basic techniques they use.
To solve the TSP, they start with a local-search heuristic called the Lin-Kernighan heuristic to generate an upper bound. Then they use branch-and-cut on a mixed integer programming formulation of the TSP. This means they write a series of linear and integer constraints which, when solved, give you the optimal tour of the TSP. Their inner loop calls a linear programming solver such as QSopt or CPLEX to get a lower bound.
As I mentioned, this is the state of the art, so if you're looking for a better way to solve the TSP than what you're doing, this is the best. They can handle over 10,000 cities in a few seconds, especially on the symmetric, planar TSP (which I suspect is the variant you're working on).
If the number of waypoints you eventually need to handle is small, say on the order of 10 to 15, then you may be able to do a branch-and-bound search using the minimum spanning tree heuristic. This is a textbook exercise in many introductory AI courses. With more waypoints than that, the running time of the algorithm will probably outlive you, and you will have to use Concorde instead.
