Finding good heuristic for A* search - algorithm

I'm trying to find the optimal solution for a little puzzle game called Twiddle (an applet with the game can be found here). The game has a 3x3 matrix with the number from 1 to 9. The goal is to bring the numbers in the correct order using the minimum amount of moves. In each move you can rotate a 2x2 square either clockwise or counterclockwise.
I.e. if you have this state
6 3 9
8 7 5
1 2 4
and you rotate the upper left 2x2 square clockwise you get
8 6 9
7 3 5
1 2 4
I'm using a A* search to find the optimal solution. My f() is simply the number of rotations needed. My heuristic function already leads to the optimal solution (if I modify it, see the notice a t the end) but I don't think it's the best one you can find. My current heuristic takes each corner, looks at the number at the corner and calculates the manhatten distance to the position this number will have in the solved state (which gives me the number of rotation needed to bring the number to this postion) and sums all these values. I.e. You take the above example:
6 3 9
8 7 5
1 2 4
and this end state
1 2 3
4 5 6
7 8 9
then the heuristic does the following
6 is currently at index 0 and should by at index 5: 3 rotations needed
9 is currently at index 2 and should by at index 8: 2 rotations needed
1 is currently at index 6 and should by at index 0: 2 rotations needed
4 is currently at index 8 and should by at index 3: 3 rotations needed
h = 3 + 2 + 2 + 3 = 10
Additionally, if h is 0, but the state is not completely ordered, than h = 1.
But there is the problem, that you rotate 4 elements at once. So there a rare cases where you can do two (ore more) of theses estimated rotations in one move. This means theses heuristic overestimates the distance to the solution.
My current workaround is, to simply excluded one of the corners from the calculation which solves this problem at least for my test-cases. I've done no research if really solves the problem or if this heuristic still overestimates in some edge-cases.
So my question is: What is the best heuristic you can come up with?
(Disclaimer: This is for a university project, so this is a bit of homework. But I'm free to use any resource if can come up with, so it's okay to ask you guys. Also I will credit Stackoverflow for helping me ;) )

Simplicity is often most effective. Consider the nine digits (in the rows-first order) as forming a single integer. The solution is represented by the smallest possible integer i(g) = 123456789. Hence I suggest the following heuristic h(s) = i(s) - i(g). For your example, h(s) = 639875124 - 123456789.

You can get an admissible (i.e., not overestimating) heuristic from your approach by taking all numbers into account, and dividing by 4 and rounding up to the next integer.
To improve the heuristic, you could look at pairs of numbers. If e.g. in the top left the numbers 1 and 2 are swapped, you need at least 3 rotations to fix them both up, which is a better value than 1+1 from considering them separately. In the end, you still need to divide by 4. You can pair up numbers arbitrarily, or even try all pairs and find the best division into pairs.

All elements should be taken into account when calculating distance, not just corner elements. Imagine that all corner elements 1, 3, 7, 9 are at their home, but all other are not.
It could be argued that those elements that are neighbors in the final state should tend to become closer during each step, so neighboring distance can also be part of heuristic, but probably with weaker influence than distance of elements to their final state.

Related

Unable to solve a codechef puzzle?

I am trying to solve this apparently beginner level problem. But not even able to come up with a brute force approach.
Here is the problem statement:
Johnny has some difficulty memorizing the small prime numbers. So, his computer science teacher has asked him to play with the following puzzle game frequently.
The puzzle is a 3x3 board consisting of numbers from 1 to 9. The objective of the puzzle is to swap the tiles until the following final state is reached:
1 2 3
4 5 6
7 8 9
At each step, Johnny may swap two adjacent tiles if their sum is a prime number. Two tiles are considered adjacent if they have a common edge.
Help Johnny to find the shortest number of steps needed to reach the goal state.
Input
The first line contains t, the number of test cases (about 50). Then t test cases follow. Each test case consists of a 3x3 table describing a puzzle which Johnny would like to solve.
The input data for successive test cases is separated by a blank line.
Output
For each test case print a single line containing the shortest number of steps needed to solve the corresponding puzzle. If there is no way to reach the final state, print the number -1.
Example
Input:
2
7 3 2
4 1 5
6 8 9
9 8 5
2 4 1
3 7 6
Output:
6
-1
Here is a start on Python for finding the distance to ALL boards that are reachable from the start.
start = '123456789'
needed_moves = {start: 0}
work_queue = [start]
while (len(work_queue)):
board = work_queue.pop(0)
for next_board in next_move(board):
if next_board not in needed_moves:
needed_moves[next_board] = needed_moves[board] + 1
work_queue.append(next_board)
This is a standard pattern for a breadth first search. Just supply a next_move function and some glue for the tests cases and you're good.
For a depth first search, just switch the queue for a stack. Which really means do work_stack.pop() to pop the last element rather than work_queue.pop(0) to get the first one.

Minimum number of steps to sort 3x3 matrix in a specific way

So I started practicing some algorithms and programming before university starts and I ran into this problem:
Given a 3x3 matrix containing the numbers from 0 to 8, find the minimum number of steps required to sort the matrix in the following format:
1 2 3
4 5 6
7 8 0
In one move it is only allowed to pick a cell that is adjacent to the cell which contains the 0 and swap those two cells.
Now, I am really stuck with this one and have no idea how to begin. Any tips and ideas to get me started are appreciated.
This is not homework if anyone thinks that way, I am just trying to exercise and by moving to tougher problems I got stuck. I am not looking for anyone to write the code for me, I just need a point in the right direction because I really want to understand the algorithm behind this. Thank you.
Note: This is actually an AI problem, and not a trivial data structure/algorithm problem.
This problem is called the n-puzzle problem. The example in your question is the 8-puzzle problem.
The way to solve this problem is by trying to shuffle the boxes in a way that each step gets you closer to your final goal. Think of this as a Greedy approach (Best-first search). The best algorithm to use here is the A* algorithm.
We define a state of the game to be the board position, the number of
moves made to reach the board position, and the previous state. First,
insert the initial state (the initial board, 0 moves, and a null
previous state) into a priority queue. Then, delete from the priority
queue the state with the minimum priority, and insert onto the
priority queue all neighboring states (those that can be reached in
one move). Repeat this procedure until the state dequeued is the goal
state. The success of this approach hinges on the choice of priority
function for a state. We consider two priority functions:
Hamming priority function. The number of blocks in the wrong position, plus the number of moves made so far to get to the state. Intutively, a state with a small number of blocks in the wrong position is close to the goal state, and we prefer a state that have been reached using a small number of moves.
Manhattan priority function. The sum of the distances (sum of the vertical and horizontal distance) from the blocks to their goal positions, plus the number of moves made so far to get to the state.
For example, the Hamming and Manhattan priorities of the initial state
below are 5 and 10, respectively.
8 1 3 1 2 3 1 2 3 4 5 6 7 8 1 2 3 4 5 6 7 8
4 2 4 5 6 ---------------------- ----------------------
7 6 5 7 8 1 1 0 0 1 1 0 1 1 2 0 0 2 2 0 3
initial goal Hamming = 5 + 0 Manhattan = 10 + 0
We make a key oberservation: to solve the puzzle from a given state
on the priority queue, the total number of moves we need to make
(including those already made) is at least its priority, using either
the Hamming or Manhattan priority function. (For Hamming priority,
this is true because each block that is out of place must move at
least once to reach its goal position. For Manhattan priority, this is
true because each block must move its Manhattan distance from its goal
position. Note that we do not count the blank tile when computing the
Hamming or Manhattan priorities.)
Consequently, as soon as we dequeue a state, we have not only
discovered a sequence of moves from the initial board to the board
associated with the state, but one that makes the fewest number of
moves.
(Source)

Algorithm to find adjacent cells in a matrix

For example, consider a non-wraparound 4x4 matrix;
1 2 5 1
5 2 5 2
9 3 1 7
2 9 0 3
If I wanted to find the neighbours of, say, the 5 in the first row = 2,5,1. Is there a more efficient solution than doing two for loops and adding a bunch of if conditions?
Yes. If you really need to find the neighbors, then you have an option to use graphs.
Graphs are basically vertex classes w/ their adjacent vertexes, forming an edge. We can see here that 2 forms an edge w/ 5, and 1 form an edge w/ 5, etc.
If you're going to need to know the neighbors VERY frequently(because this is inefficient if you're not), then implement your own vertex class, wrapping the value(5) in a generic T val variable. Have a hashtable of adjacent numbers and their respective distances(1 in this case, and if you need to find neighbors of 2, then you're going to need to assign those as well) by add(vertex, distance) into the hashtable.
Later on, simply iterate through the hashtable for the neighbors.
However, for an array this simple, there isn't much overhead for just doing a for loop and using "a bunch of if statements". In reality you only need to have if(boundaries check) for every direction(which is 4).
Hopefully this helps.

Is there better way than a Dijkstra algorithm for finding fastest path that do not exceed specified cost

I'm having a problem with finding fastest path that do not exceed specified cost.
There's similar question to this one, however there's a big difference between them.
Here, the only records that can appear in the data are the ones, that lead from a lower point to a higher point (eg. 1 -> 3 might appear but 3 -> 1 might not) (see below). Without knowing that, I'd use Dijkstra. That additional information, might let it do in a time faster than Dijkstras algorithm.
What do you think about it?
Let's say I've got specified maximum cost, and 4 records.
// specified cost
10
// end point
5
//(start point) (finish point) (time) (cost)
2 5 50 5
3 5 20 9
1 2 30 5
1 3 30 7
// all the records lead from a lower to a higher point no.
I have to decide, whether It's possible to get from point (1) to (5) (its impossible when theres no path that costs <= than we've got or when theres no connection between 1-5) and if so, what would be the fastest way to get in there.
The output for such data would be:
80 // fastest time
3 1 // number of points that (1 -> 2) -> (2 -> 5)
Keep in mind, that if there's a record saying you can move 1->2
1 2 30 5
It doesnt allow you to move 2<-1.
For each node at depth n, the minimum cost of path to it is n/2 * (minimal first edge at the path + minimal edge connecting to the node) - sum of arithmetic series. If this computation exceeds the required maximum, no need to check the nodes that follow. Cut these nodes off and apply Dijkstra on the rest.

plane bombing problems- help

I'm training code problems, and on this one I am having problems to solve it, can you give me some tips how to solve it please.
The problem is taken from here:
https://www.ieee.org/documents/IEEEXtreme2008_Competitition_book_2.pdf
Problem 12: Cynical Times.
The problem is something like this (but do refer to above link of the source problem, it has a diagram!):
Your task is to find the sequence of points on the map that the bomber is expected to travel such that it hits all vital links. A link from A to B is vital when its absence isolates completely A from B. In other words, the only way to go from A to B (or vice versa) is via that link.
Due to enemy counter-attack, the plane may have to retreat at any moment, so the plane should follow, at each moment, to the closest vital link possible, even if in the end the total distance grows larger.
Given all coordinates (the initial position of the plane and the nodes in the map) and the range R, you have to determine the sequence of positions in which the plane has to drop bombs.
This sequence should start (takeoff) and finish (landing) at the initial position. Except for the start and finish, all the other positions have to fall exactly in a segment of the map (i.e. it should correspond to a point in a non-hit vital link segment).
The coordinate system used will be UTM (Universal Transverse Mercator) northing and easting, which basically corresponds to a Euclidian perspective of the world (X=Easting; Y=Northing).
Input
Each input file will start with three floating point numbers indicating the X0 and Y0 coordinates of the airport and the range R. The second line contains an integer, N, indicating the number of nodes in the road network graph. Then, the next N (<10000) lines will each contain a pair of floating point numbers indicating the Xi and Yi coordinates (1 < i<=N). Notice that the index i becomes the identifier of each node. Finally, the last block starts with an integer M, indicating the number of links. Then the next M (<10000) lines will each have two integers, Ak and Bk (1 < Ak,Bk <=N; 0 < k < M) that correspond to the identifiers of the points that are linked together.
No two links will ever cross with each other.
Output
The program will print the sequence of coordinates (pairs of floating point numbers with exactly one decimal place), each one at a line, in the order that the plane should visit (starting and ending in the airport).
Sample input 1
102.3 553.9 0.2
14
342.2 832.5
596.2 638.5
479.7 991.3
720.4 874.8
744.3 1284.1
1294.6 924.2
1467.5 659.6
1802.6 659.6
1686.2 860.7
1548.6 1111.2
1834.4 1054.8
564.4 1442.8
850.1 1460.5
1294.6 1485.1
17
1 2
1 3
2 4
3 4
4 5
4 6
6 7
7 8
8 9
8 10
9 10
10 11
6 11
5 12
5 13
12 13
13 14
Sample output 1
102.3 553.9
720.4 874.8
850.1 1460.5
102.3 553.9
Pre-process the input first, so you identify the choke points. Algorithms like Floyd-Warshall would help you.
Model the problem as a Heuristic Search problem, you can compute a MST which covers all choke-points and take the sum of the costs of the edges as a heuristic.
As the commenters said, try to make concrete questions, either here or to the TA supervising your class.
Don't forget to mention where you got these hints.
The problem can be broken down into two parts.
1) Find the vital links.
These are nothing but the Bridges in the graph described. See the wiki page (linked to in the previous sentence), it mentions an algorithm by Tarjan to find the bridges.
2) Once you have the vital links, you need to find the smallest number of points which given the radius of the bomb, will cover the links. For this, for each link, you create a region around it, where dropping the bomb will destroy it. Now you form a graph of these regions (two regions are adjacent if they intersect). You probably need to find a minimum clique partition in this graph.
Haven't thought it through (especially part 2), but hope it helps.
And good luck in the contest!
I think Moron' is right about the first part, but on the second part...
The problem description does not tell anything about "smallest number of points". It tells that the plane flies to the closest vital link.
So, I think the part 2 will be much simpler:
Find the closest non-hit segment to the current location.
Travel to the closest point on the closest segment.
Bomb the current location (remove all segments intersecting a circle)
Repeat until there are no non-hit vital links left.
This straight-forward algorithm has a complexity of O(N*N), but this should be sufficient considering input constraints.

Resources