TSP Genetic Algorithm - path representation and identical tour problem - algorithm

We are implementing path representation to solve our travelling salesman problem using a genetic algorithm. However, we were wondering how to solve the issue that there might be identical tours in our individuals, but which are recognised by the path representation as different individuals. An example:
Each individual consists of an array, in which the elements are the cities visited in order.
Individual 1:
[1 2 3 4 5]
Individual 2:
[4 5 1 2 3]
You can see that the tour in 1 and 2 are actually identical, only the "starting" location is different.
We see some solutions to this problem, but we were wondering which one would be the best, or if there are best practices to overcome this problem from literature/experiments/....
Solution 1
Sort each individual to see if the individuals are identical:
1. pick an individual
2. shift the elements by 1 element to the right (at the end of the array, elements are placed at the beginning of the array)
3. check if this shift now matches an individual
4. if not, repeat steps 3 to 4
Solution 2
1. At the start if the simulations, choose a fixed starting point of the cities.
2. If the fixed starting point would change (mutation, recombination,...) then
3. Shift the array so that chosen starting point is back on first index.
Solution 3
1. Use the adjacency representation to check which individuals are identical.
2. Pass this information on to the path representation.
3. This is used to "correct" the individuals.
Solution 1 and 2 seem time consuming, although 2 would probably need much less computing time. Solution 3 would need to constantly switch from one to the other representation.
Then there is also the issue that in our example the tour can be read in 2 ways:
[1 2 3 4 5]
is the same as
[5 4 3 2 1]
Again here, are there any best practises the solve this?

Since you need to visit every city and return to the origin city, you can simply fix the origin. That solves your problem of shifted equivalent tours outright.
For the other, less important issue of mirrored tours, you can start by sorting your individuals by cost (which you probably already do), and check any pair of tours with equal costs using a simple palindrome-checking algorithm.

Related

Efficient graph search with brute force

I could not think of a better title.
My problem is this:
I have a list of symbols and each of them is tagged with at least one classification (some of them could have more).
Let's say that I have 4 symbols: [[1,2], 1, 1, 1,] -> the first one can be classified as 1 or 2 and the other 3 are always classified as 1 (instead of numbers you could think of strings such as 'human', 'animal' and so on).
I have some "rules" that can merge symbols and generate new ones. This process generate a graph. In the example above, I have 2 initial states:
[1,1,1,1] and [2,1,1,1] and if I apply some rules on them I obtain 2 distinct graphs (rules such as unify an 1 with an 1 if, or a 2 with an 1 if and so on...).
One solution could be to apply every rule to every initial state and generate the final graph, but I was thinking that some part of the graph could potentially be the same and I would end up recomputing the same thing more than once.
For example, if I have [[1,2,3], 1,....long part disjoint from the previous one], maybe the only differences in these graphs is only on the left part, but the right one is always the same (with a brute force approach, I would always recompute it).
Could you think of a clever way to not recompute the same portion of the graph? Does this problem I have remember you about something that I could read?

Algorithm to find adjacent cells in a matrix

For example, consider a non-wraparound 4x4 matrix;
1 2 5 1
5 2 5 2
9 3 1 7
2 9 0 3
If I wanted to find the neighbours of, say, the 5 in the first row = 2,5,1. Is there a more efficient solution than doing two for loops and adding a bunch of if conditions?
Yes. If you really need to find the neighbors, then you have an option to use graphs.
Graphs are basically vertex classes w/ their adjacent vertexes, forming an edge. We can see here that 2 forms an edge w/ 5, and 1 form an edge w/ 5, etc.
If you're going to need to know the neighbors VERY frequently(because this is inefficient if you're not), then implement your own vertex class, wrapping the value(5) in a generic T val variable. Have a hashtable of adjacent numbers and their respective distances(1 in this case, and if you need to find neighbors of 2, then you're going to need to assign those as well) by add(vertex, distance) into the hashtable.
Later on, simply iterate through the hashtable for the neighbors.
However, for an array this simple, there isn't much overhead for just doing a for loop and using "a bunch of if statements". In reality you only need to have if(boundaries check) for every direction(which is 4).
Hopefully this helps.

Generic solution to towers of Hanoi with variable number of poles?

Given D discs, P poles, and the initial starting positions for the disks, and the required final destination for the poles, how can we write a generic solution to the problem?
For example,
Given D=6 and P=4, and the initial starting position looks like this:
5 1
6 2 4 3
Where the number represents the disk's radius, the poles are numbered 1-4 left-right, and we want to stack all the disks on pole 1.
How do we choose the next move?
The solution is (worked out by hand):
3 1
4 3
4 1
2 1
3 1
(format: <from-pole> <to-pole>)
The first move is obvious, move the "4" on top of the "5" because that its required position in the final solution.
Next, we probably want to move the next largest number, which would be the "3". But first we have to unbury it, which means we should move the "1" next. But how do we decide where to place it?
That's as far as I've gotten. I could write a recursive algorithm that tries all possible places, but I'm not sure if that's optimal.
We can't.
More precisely, as http://en.wikipedia.org/wiki/Tower_of_Hanoi#Four_pegs_and_beyond says, for 4+ pegs, proving what the optimal solution is is an open problem. There is a known very good algorithm, that is widely believed to be optimal, for the simple case where the pile of disks is on one peg and you want to transfer the whole pile to another. However we do not have an algorithm, or even a known heuristic, for an arbitrary starting position.
If we did have a proposed algorithm, then the open problem would presumably be much easier.

Algorithm design to assign nodes to graphs

I have a graph-theoretic (which is also related to combinatorics) problem that is illustrated below, and wonder what is the best approach to design an algorithm to solve it.
Given 4 different graphs of 6 nodes (by different, I mean different structures, e.g. STAR, LINE, COMPLETE, etc), and 24 unique objects, design an algorithm to assign these objects to these 4 graphs 4 times, so that the number of repeating neighbors on the graphs over the 4 assignments is minimized. For example, if object A and B are neighbors on 1 of the 4 graphs in one assignment, then in the best case, A and B will not be neighbors again in the other 3 assignments.
Obviously, the degree to which such minimization can go is dependent on the specific graph structures given. But I am more interested in a general solution here so that given any 4 graph structures, such minimization is guaranteed as the result of the algorithm.
Any suggestion/idea of solving this problem is welcome, and some pseudo-code may well be sufficient to illustrate the design. Thank you.
Representation:
You have 24 elements, I will name this elements from A to X (24 first letters).
Each of these elements will have a place in one of the 4 graphs. I will assign a number to the 24 nodes of the 4 graphs from 1 to 24.
I will identify the position of A by a 24-uple =(xA1,xA2...,xA24), and if I want to assign A to the node number 8 for exemple, I will write (xa1,Xa2..xa24) = (0,0,0,0,0,0,0,1,0,0...0), where 1 is on position 8.
We can say that A =(xa1,...xa24)
e1...e24 are the unit vectors (1,0...0) to (0,0...1)
note about the operator '.':
A.e1=xa1
...
X.e24=Xx24
There are some constraints on A,...X with these notations :
Xii is in {0,1}
and
Sum(Xai)=1 ... Sum(Xxi)=1
Sum(Xa1,xb1,...Xx1)=1 ... Sum(Xa24,Xb24,... Xx24)=1
Since one element can be assign to only one node.
I will define a graph by defining the neighbors relation of each node, lets say node 8 has neighbors node 7 and node 10
to check that A and B are neighbors on node 8 for exemple I nedd:
A.e8=1 and B.e7 or B.e10 =1 then I just need A.e8*(B.e7+B.e10)==1
in the function isNeighborInGraphs(A,B) I test that for every nodes and I get one or zero depending on the neighborhood.
Notations:
4 graphs of 6 nodes, the position of each element is defined by an integer from 1 to 24.
(1 to 6 for first graph, etc...)
e1... e24 are the unit vectors (1,0,0...0) to (0,0...1)
Let A, B ...X be the N elements.
A=(0,0...,1,...,0)=(xa1,xa2...xa24)
B=...
...
X=(0,0...,1,...,0)
Graph descriptions:
IsNeigborInGraphs(A,B)=A.e1*B.e2+...
//if 1 and 2 are neigbors in one graph
for exemple
State of the system:
L(A)=[B,B,C,E,G...] // list of
neigbors of A (can repeat)
actualise(L(A)):
for element in [B,X]
if IsNeigbotInGraphs(A,Element)
L(A).append(Element)
endIf
endfor
Objective functions
N(A)=len(L(A))+Sum(IsneigborInGraph(A,i),i in L(A))
...
N(X)= ...
Description of the algorithm
start with an initial position
A=e1... X=e24
Actualize L(A),L(B)... L(X)
Solve this (with a solveur, ampl for
exemple will work I guess since it's
a nonlinear optimization
problem):
Objective function
min(Sum(N(Z),Z=A to X)
Constraints:
Sum(Xai)=1 ... Sum(Xxi)=1
Sum(Xa1,xb1,...Xx1)=1 ...
Sum(Xa24,Xb24,... Xx24)=1
You get the best solution
4.Repeat step 2 and 3, 3 more times.
If all four graphs are K_6, then the best you can do is choose 4 set partitions of your 24 objects into 4 sets each of cardinality 6 so that the pairwise intersection of any two sets has cardinality at most 2. You can do this by choosing set partitions that are maximally far apart in the Hasse diagram of set partitions with partial order given by refinement. The general case is much harder, but perhaps you can still begin with this crude approximation of a solution and then be clever with which vertex is assigned which object in the four assignments.
Assuming you don't want to cycle all combinations and calculate the sum every time and choose the lowest, you can implement a minimum problem (solved depending on your constraints using either a linear programming solver i.e. symplex algorithm engines or a non-linear solver, much harder talking in terms of time) with constraints on your variables (24) depending on the shape of your path. You can also use free software like LINGO/LINDO to create rapidly a decision theory model and test its correctness (you need decision theory notions though)
If this has anything to do with the real world, then it's unlikely that you absolutely must have a solution that is the true minimum. Close to the minimum should be good enough, right? If so, you could repeatedly randomly make the 4 assignments and check the results until you either run out of time or have a good-enough solution or appear to have stopped improving your best solution.

Finding good heuristic for A* search

I'm trying to find the optimal solution for a little puzzle game called Twiddle (an applet with the game can be found here). The game has a 3x3 matrix with the number from 1 to 9. The goal is to bring the numbers in the correct order using the minimum amount of moves. In each move you can rotate a 2x2 square either clockwise or counterclockwise.
I.e. if you have this state
6 3 9
8 7 5
1 2 4
and you rotate the upper left 2x2 square clockwise you get
8 6 9
7 3 5
1 2 4
I'm using a A* search to find the optimal solution. My f() is simply the number of rotations needed. My heuristic function already leads to the optimal solution (if I modify it, see the notice a t the end) but I don't think it's the best one you can find. My current heuristic takes each corner, looks at the number at the corner and calculates the manhatten distance to the position this number will have in the solved state (which gives me the number of rotation needed to bring the number to this postion) and sums all these values. I.e. You take the above example:
6 3 9
8 7 5
1 2 4
and this end state
1 2 3
4 5 6
7 8 9
then the heuristic does the following
6 is currently at index 0 and should by at index 5: 3 rotations needed
9 is currently at index 2 and should by at index 8: 2 rotations needed
1 is currently at index 6 and should by at index 0: 2 rotations needed
4 is currently at index 8 and should by at index 3: 3 rotations needed
h = 3 + 2 + 2 + 3 = 10
Additionally, if h is 0, but the state is not completely ordered, than h = 1.
But there is the problem, that you rotate 4 elements at once. So there a rare cases where you can do two (ore more) of theses estimated rotations in one move. This means theses heuristic overestimates the distance to the solution.
My current workaround is, to simply excluded one of the corners from the calculation which solves this problem at least for my test-cases. I've done no research if really solves the problem or if this heuristic still overestimates in some edge-cases.
So my question is: What is the best heuristic you can come up with?
(Disclaimer: This is for a university project, so this is a bit of homework. But I'm free to use any resource if can come up with, so it's okay to ask you guys. Also I will credit Stackoverflow for helping me ;) )
Simplicity is often most effective. Consider the nine digits (in the rows-first order) as forming a single integer. The solution is represented by the smallest possible integer i(g) = 123456789. Hence I suggest the following heuristic h(s) = i(s) - i(g). For your example, h(s) = 639875124 - 123456789.
You can get an admissible (i.e., not overestimating) heuristic from your approach by taking all numbers into account, and dividing by 4 and rounding up to the next integer.
To improve the heuristic, you could look at pairs of numbers. If e.g. in the top left the numbers 1 and 2 are swapped, you need at least 3 rotations to fix them both up, which is a better value than 1+1 from considering them separately. In the end, you still need to divide by 4. You can pair up numbers arbitrarily, or even try all pairs and find the best division into pairs.
All elements should be taken into account when calculating distance, not just corner elements. Imagine that all corner elements 1, 3, 7, 9 are at their home, but all other are not.
It could be argued that those elements that are neighbors in the final state should tend to become closer during each step, so neighboring distance can also be part of heuristic, but probably with weaker influence than distance of elements to their final state.

Resources