Path finding Algorithms, Stuck on basics - algorithm

This is a basic college puzzle question i have which i though i will nail, but when it comes to implementing it i got stuck right on the get go.
Background: a 4 digit path finding algorithm, so given for example a target number 1001 and target number 1040, find path using BFs,DFs...Etc.
Allowed moves: add or subtract 1 from any of the 4 digits in any order, you cannot add to 9, or subtract from 0. You cannot add or subtract from the same digit twice i.e. to get from 1000 to 3000, you cannot do 2000 ->3000 as you’ll be adding 1 twice to the first digit. So you'll have to do 2000->2100->3100->3000.
Where I am Stuck: I just cannot figure out how i will represent the problem programmatically, never mind the algorithms, but how i will represent 1001 and move towards 1004 .What data structure? How will I identify the neighbours?
Can someone please push me towards the right direction?

If I understand correctly, you're looking for a way to represent your problem. Perhaps you're having difficulties to find it, because the problem uses a 4 digit number. Try to reduce the dimensions, and consider the same problem with 2 digits numbers (see also this other answer). Suppose you want to move from 10 to 30 using the problem's rules. Now that you have two dimensions, you may easily represent your problem on a 10x10 matrix (each digit can have 10 different values):
:
In the picture above S is your source position (10), and T is your target position (30). Now you can apply the rules of the problem. I don't know what you want to find: how many paths? just a path? the shortest path?
Anyway, a sketch of algorithm for handling moves is:
if you look at T you see that according to the rules of your problem, the only possible neighbours from which you can reach (30) are (20), (31), and (40).
now, let's assume you choose (20), then the only possible neighbour from which you can reach (20) is (21).
now, you're forced to choose (21), and the only possible neighbours from which you can reach (21) are (11) and (31).
finally, if you choose (11), then the only possible neighbours from which you can reach (11) are (10) and (12) ... and (10) is your source S, so you're done.
This is just a sketch for the algorithm (it doesn't say anything on how to choose from the list of possible neighbours), but I hope it gives you an idea.
Finally, when you've solved the problem in the two-dimensional space, you can extend it to a three-dimensional space, and to a four-dimensional space (your original problem), where you have a 10x10x10x10 matrix. Each dimension has always 10 slots/positions.
I hope this helps.

I would start with a directed graph where each node represents your number plus an additional degree of freedom (because of the constraint). E.g. you can think of this as another digit added to your number which represents the "digit changed last". E.g. you start with 1001[0] and this node connects to
1001[0] -> 2001[1]
1001[0] -> 0001[1]
1001[0] -> 1101[2]
1001[0] -> 1011[3]
1001[0] -> 1002[4]
1001[0] -> 0000[4]
or 2001[1] to
2001[1] -> 2101[2]
2001[1] -> 2011[3]
2001[1] -> 2002[4]
2001[1] -> 2000[4]
Please note: the ????[0] nodes are only allowed as starting nodes. And each edge always connects numbers with different last digit.
So for each number xyzw[d] the outgoing edges are given by
xyzw[d] -> (x+1)yzw[1] if x<9 && d!=1
xyzw[d] -> (x-1)yzw[1] if x>0 && d!=1
xyzw[d] -> x(y+1)zw[2] if y<9 && d!=2
xyzw[d] -> x(y-1)zw[2] if y>0 && d!=2
xyzw[d] -> xy(z+1)w[3] if z<9 && d!=3
xyzw[d] -> xy(z-1)w[3] if z>0 && d!=3
xyzw[d] -> xyz(w+1)[4] if w<9 && d!=4
xyzw[d] -> xyz(w-1)[4] if w>0 && d!=4
Your targets are now the four nodes 1004[1] .. 1004[4]. If your algorithm requires a single target node you may add an artificial edge ????[?] -> ????[0] for each number and then have 1004[0] as final.

The first thing that came to my mind is that this is just a point with extra dimensions. That is, rather than [x,y] to represent a point on a 2-D grid, you have [w,x,y,z] to represent a point in a 4-D space.
The one thing I'm not sure in your problem is what happens when you add a 1 to a 9. Is there a remainder that get's carried over, or does the number just get rolled over to zero? Or does this represent an edge that can't be crossed?
In any case, I'd start with thinking of this as a type of point with extra dimensionality, and apply your algorithms appropriately.

You can represent each state as an integer array of length 4. Then have a graph structure to connect the states (I would not advice on building the graph fully, as this alone will require O(10^4) memory).
Then use A* algorithm for traversing the graph and reaching the destination node.
Or simply use a BFS traversal routine where you generate the neighbouring states as you process a state. You'll have to keep a track of the states already visited (some sort of a set like data structure).

When visit a node you need to iterate through all children of the node and visit some of them, so you should only implement children generator function getting one parameter - current number like 1001. It's the "core" of the algorithm, then you need to use recoursion to traverse a tree. The following is code that DOES NOT WORK and goes into infinite recoursion, but is a sketch explaining my idea:
from random import random
def shouldVisit(digits):
#very smart function defining search strategy
return random() > 0.5
def traverse(source, target, prev):
if source == target:
print 'wow!'
return
for i in range(len(source)):
if i == prev:
continue
d = source[i]
if d > 0:
res = list(source)
res[i] = d - 1
if shouldVisit(res):
traverse(res, target, i)
if d < 9:
res = list(source)
res[i] = d + 1
if shouldVisit(res):
traverse(res, target, i)
def traverse0(source, target):
def tolist(num):
digits = []
while num:
digits.append(num % 10)
num /= 10
return digits
traverse(tolist(source), tolist(target), -1)
traverse0(1001, 1040)
very smart stub function shouldVisit may check for example if given node has been already visited

Related

Dijkstra's algorithm: is my implementation flawed?

In order to train myself both in Python and graph theory, I tried to implement the Dijkstra algo using Python 3, and submitted it against several online judges, to see if it was correct.
It works well in many cases, but not always.
For example, I am stuck with this one: the test case works fine and I also have tried custom test cases of my own, but when I submit the following solution, the judge keeps telling me "wrong answer", and the expected result is very different from my output, indeed.
Notice that the judge tests it against quite a complex graph (10000 nodes with 100000 edges), while all the cases I tried before never exceeded 20 nodes and around 20-40 edges.
Here is my code.
Given al an adjacency list in the following form:
al[n] = [(a1, w1), (a2, w2), ...]
where
n is the node id;
a1, a2, etc. are its adjacent nodes and w1, w2, etc. the respective weights for the given edge;
and supposing that maximum distance never exceeds 1 billion, I implemented Dijkstra's algorithm this way:
import queue
distance = [1000000000] * (N+1) # this is the array where I store the shortest path between 1 and each other node
distance[1] = 0 # starting from node 1 with distance 0
pq = queue.PriorityQueue()
pq.put((0, 1)) # same as above
visited = [False] * (N+1)
while not pq.empty():
n = pq.get()[1]
if visited[n]:
continue
visited[n] = True
for edge in al[n]:
if distance[edge[0]] > distance[n] + edge[1]:
distance[edge[0]] = distance[n] + edge[1]
pq.put((distance[edge[0]], edge[0]))
Could you please help me understand wether my implementation is flawed, or if I simply ran into some bugged online judge?
Thank you very much.
UPDATE
As requested, I'm providing the snippet I use to populate the adjacency list al for the linked problem.
N,M = input().split()
N,M = int(N), int(M)
al = [[] for n in range(N+1)]
for m in range(M):
try:
a,b,w = input().split()
a,b,w = int(a), int(b), int(w)
al[a].append((b, w))
al[b].append((a, w))
except:
pass
(Please don't mind the ugly "except: pass", I was using it just for debugging purposes... :P)
Primary problem in interpreting the question:
According to your parsing code, you are treating the input data as an undirected graph, i.e. each edge from A to B also is an edge from B to A.
Is seems like this premise is not valid and it should instead be a directed graph, i.e. you have to remove this line:
al[b].append((a, w)) # no back reference!
Previous problem, now already fixed in the code:
Currently, you are using the never-changing weight of the edges in your queue:
pq.put((edge[1], edge[0]))
This way, the nodes always end up at the same position in the queue, no matter at what stage of the algorithm and how far the path to reach that node actually is.
Instead, you should use the new distance to the target node edge[0], i.e. distance[edge[0]] as the priority in the queue:
pq.put((distance[edge[0]], edge[0]))

Algorithm to find streets and same kind in a hand

This is actually a Mahjong-based question, but a Romme- or even Poker-based background will also easily suffice to understand.
In Mahjong 14 tiles (tiles are like cards in Poker) are arranged to 4 sets and a pair. A street ("123") always uses exactly 3 tiles, not more and not less. A set of the same kind ("111") consists of exactly 3 tiles, too. This leads to a sum of 3 * 4 + 2 = 14 tiles.
There are various exceptions like Kan or Thirteen Orphans that are not relevant here. Colors and value ranges (1-9) are also not important for the algorithm.
I'm trying to determine if a hand can be arranged in the way described above. For certain reasons it should not only be able to deal with 14 but any number of tiles. (The next step would be to find how many tiles need to be exchanged to be able to complete a hand.)
Examples:
11122233344455 - easy enough, 4 sets and a pair.
12345555678999 - 123, 456, 789, 555, 99
11223378888999 - 123, 123, 789, 888, 99
11223344556789 - not a valid hand
My current and not yet implemented idea is this: For each tile, try to make a) a street b) a set c) a pair. If none works (or there would be > 1 pair), go back to the previous iteration and try the next option, or, if this is the highest level, fail. Else, remove the used tiles from the list of remaining tiles and continue with the next iteration.
I believe this approach works and would also be reasonably fast (performance is a "nice bonus"), but I'm interested in your opinion on this. Can you think of alternate solutions? Does this or something similar already exist?
(Not homework, I'm learning to play Mahjong.)
The sum of the values in a street and in a set can be divided by 3:
n + n + n = 3n
(n-1) + n + (n + 1) = 3n
So, if you add together all the numbers in a solved hand, you would get a number of the form 3N + 2M where M is the value of the tile in the pair. The remainder of the division by three (total % 3) is, for each value of M :
total % 3 = 0 -> M = {3,6,9}
total % 3 = 1 -> M = {2,5,8}
total % 3 = 2 -> M = {1,4,7}
So, instead of having to test nine possible pairs, you only have to try three based on a simple addition. For each possible pair, remove two tiles with that value and move on to the next step of the algorithm to determine if it's possible.
Once you have this, start with the lowest value. If there are less than three tiles with that value, it means they're necessarily the first element of a street, so remove that street (if you can't because tiles n+1 or n+2 are missing, it means the hand is not valid) and move on to the next lowest value.
If there are at least three tiles with the lowest value, remove them as a set (if you ask "what if they were part of a street?" consider that if they were, then there are also three of tile n+1 and three of tile n+2, which can also be turned into sets) and continue.
If you reach an empty hand, the hand is valid.
For example, for your invalid hand the total is 60, which means M = {3,6,9}:
Remove the 3: 112244556789
- Start with 1: there are less than three, so remove a street
-> impossible: 123 needs a 3
Remove the 6: impossible, there is only one
Remove the 9: impossible, there is only one
With your second example 12345555678999, the total is 78, which means M = {3,6,9}:
Remove the 3: impossible, there is only one
Remove the 6: impossible, there is only one
Remove the 9: 123455556789
- Start with 1: there is only one, so remove a street
-> 455556789
- Start with 4: there is only one, so remove a street
-> 555789
- Start with 5: there are three, so remove a set
-> 789
- Start with 7: there is only one, so remove a street
-> empty : hand is valid, removals were [99] [123] [456] [555] [789]
Your third example 11223378888999 also has a total of 78, which causes backtracking:
Remove the 3: 11227888899
- Start with 1: there are less than three, so remove a street
-> impossible: 123 needs a 3
Remove the 6: impossible, there are none
Remove the 9: 112233788889
- Start with 1: there are less than three, so remove streets
-> 788889
- Start with 7: there is only one, so remove a street
-> 888
- Start with 8: there are three, so remove a set
-> empty, hand is valid, removals were : [99] [123] [123] [789] [888]
There is a special case that you need to do some re-work to get it right. This happens when there is a run-of-three and a pair with the same value (but in different suit).
Let b denates bamboo, c donates character, and d donates dot, try this hand:
b2,b3,b4,b5,b6,b7,c4,c4,c4,d4,d4,d6,d7,d8
d4,d4 should serve as the pair, and c4,c4,c4 should serve as the run-of-3 set.
But because the 3 "c4" tiles appear before the 2 d4 tiless, the first 2 c4 tiles will be picked up as the pair, leaving an orphan c4 and 2 d4s, and these 3 tiles won't form a valid set.
In this case, you'll need to "return" the 2 c4 tiles back to the hand (and keep the hand sorted), and search for next tile that meets the criteria (value == 4). To do that you'll need to make the code "remember" that it had tried c4 so in next iteration it should skip c4 and looks for other tiles with value == 4. The code will be a bit messy, but doable.
I would break it down into 2 steps.
Figure out possible combinations. I think exhaustive checking is feasible with these numbers. The result of this step is a list of combinations, where each combination has a type (set, street, or pair) and a pattern with the cards used (could be a bitmap).
With the previous information, determine possible collections of combinations. This is where a bitmap would come in handy. Using bitwise operators, you could see overlaps in usage of the same tile for different combinators.
You could also do a step 1.5 where you just check to see if enough of each type is available. This step and step 2 would be where you would be able to create a general algorithm. The first step would be the same for all numbers of tiles and possible combinations quickly.

Finding the longest road in a Settlers of Catan game algorithmically

I'm writing a Settlers of Catan clone for a class. One of the extra credit features is automatically determining which player has the longest road. I've thought about it, and it seems like some slight variation on depth-first search could work, but I'm having trouble figuring out what to do with cycle detection, how to handle the joining of a player's two initial road networks, and a few other minutiae. How could I do this algorithmically?
For those unfamiliar with the game, I'll try to describe the problem concisely and abstractly: I need to find the longest possible path in an undirected cyclic graph.
This should be a rather easy algorithm to implement.
To begin with, separate out the roads into distinct sets, where all the road segments in each set are somehow connected. There's various methods on doing this, but here's one:
Pick a random road segment, add it to a set, and mark it
Branch out from this segment, ie. follow connected segments in both directions that aren't marked (if they're marked, we've already been here)
If found road segment is not already in the set, add it, and mark it
Keep going from new segments until you cannot find any more unmarked segments that are connected to those currently in the set
If there's unmarked segments left, they're part of a new set, pick a random one and start back at 1 with another set
Note: As per the official Catan Rules, a road can be broken if another play builds a settlement on a joint between two segments. You need to detect this and not branch past the settlement.
Note, in the above, and following, steps, only consider the current players segments. You can ignore those other segments as though they weren't even on the map.
This gives you one or more sets, each containing one or more road segments.
Ok, for each set, do the following:
Pick a random road segment in the set that has only one connected road segment out from it (ie. you pick an endpoint)
If you can't do that, then the whole set is looping (one or more), so pick a random segment in this case
Now, from the segment you picked, do a recursive branching out depth-first search, keeping track of the length of the current road you've found so far. Always mark road segments as well, and don't branch into segments already marked. This will allow the algorithm to stop when it "eats its own tail".
Whenever you need to backtrack, because there are no more branches, take a note of the current length, and if it is longer than the "previous maximum", store the new length as the maximum.
Do this for all the sets, and you should have your longest road.
A simple polynomial-time depth-first search is unlikely to work, since the problem is NP-hard. You will need something that takes exponential time to get an optimal solution. Since the problem is so small, that should be no problem in practice, though.
Possibly the simplest solution would be dynamic programming: keep a table T[v, l] that stores for each node v and each length l the set of paths that have length l and end in v. Clearly T[v, 1] = {[v]} and you can fill out T[v, l] for l > 1 by collecting all paths from T[w, l-1] where w is a neighbor of v that do not already contain v, and then attaching v. This is similar to Justin L.'s solution, but avoids some duplicate work.
I would suggest a breadth-first search.
For each player:
Set a default known maximum of 0
Pick a starting node. Skip if it has zero or more than one connected neighbors and find all of the player's paths starting from it in a breadth-first manner. The catch: Once a node has been traversed to, it is "deactivated" for the all searches left in that turn. That is, it may no longer be traversed.
How can this be implemented? Here's one possible breadth-first algorithm that can be used:
If there are no paths from this initial node, or more than one path, mark it deactivated and skip it.
Keep a queue of paths.
Add a path containing only the initial dead-end node to the queue. Deactivate this node.
Pop the first path out of the queue and "explode" it -- that is, find all valid paths that are the the path + 1 more step in a valid direction. By "valid", the next node must be connected to the last one by a road, and it also must be activated.
Deactivate all nodes stepped to during the last step.
If there are no valid "explosions" of the previous path, then compare that length of that path to the known maximum. If greater than it, it is the new maximum.
Add all exploded paths, if any, back into the queue.
Repeat 4-7 until the queue is empty.
After you do this once, move onto the next activated node and start the process all over again. Stop when all nodes are deactivated.
The maximum you have now is your longest road length, for the given player.
Note that this is slightly inefficient, but if performance doesn't matter, then this would work :)
IMPORTANT NOTE, thanks to Cameron MacFarland
Assume all nodes with cities that do not belong to the current player automatically deactivated always.
Ruby pseudocode (assumes an get_neighbors function for each node)
def explode n
exploded = n.get_neighbors # get all neighbors
exploded.select! { |i| i.activated? } # we only want activated ones
exploded.select! { |i| i.is_connected_to(n) } # we only want road-connected ones
return exploded
end
max = 0
nodes.each do |n| # for each node n
next if not n.activated? # skip if node is deactivated
if explode(n).empty? or explode(n).size > 1
n.deactivate # deactivate and skip if
next # there are no neighbors or
end # more than one
paths = [ [n] ] # start queue
until paths.empty? # do this until the queue is empty
curr_path = paths.pop # pop a path from the queue
exploded = explode(curr_path) # get all of the exploded valid paths
exploded.each { |i| i.deactivate } # deactivate all of the exploded valid points
if exploded.empty? # if no exploded paths
max = [max,curr_path.size].max # this is the end of the road, so compare its length to
# the max
else
exploded.each { |i| paths.unshift(curr_path.clone + i) }
# otherwise, add the new paths to the queue
end
end
end
puts max
Little late, but still relevant. I implemented it in java, see here. Algorithm goes as follows:
Derive a graph from the main graph using all edges from a player, adding the vertices of the main graph connected to the edges
Make a list of ends (vertices where degree==1) and splits (vertices where degree==3)
Add a route to check (possibleRoute) for every end, and for every two edges + one vertex combination found (3, since degree is 3) at every split
Walk every route. Three things can be encountered: an end, intermediate or a split.
End: Terminate possibleRoute, add it to the found list
Intermediate: check if connection to other edge is possible. If so, add the edge. If not, terminate the route and add it to the found routes list
Split: For both edges, do as intermediate. When both routes connect, copy the second route and add it to the list, too.
Checking for a connection is done using two methods: see if the new edge already exists in the possibleRoute (no connection), and asking the new edge if it can connect by giving the vertex and the originating vertex as parameters. A road can connect to a road, but only if the vertex does not contain a city/settlement from another player. A road can connect to a ship, but only if the vertex holds a city or road from the player whom route is being checked.
Loops may exist. Every encountered edge is added to a list if not already in there. When all possibleRoutes are checked, but the amount of edges in this list is less then the total amount of edges of the player, one or more loops exist. When this is the case, any nonchecked edge is made a new possibleRoute from, and is being checked again for the route.
Length of longest route is determined by simply iterating over all routes. The first route encountered having equal or more then 5 edges is returned.
This implementation supports most variants of Catan. The edges can decide for themselves if they want to connect to another, see
SidePiece.canConnect(point, to.getSidePiece());
A road, ship, bridge have SidePiece interface implemented. A road has as implementation
public boolean canConnect(GraphPoint graphPoint, SidePiece otherPiece)
{
return (player.equals(graphPoint.getPlayer()) || graphPoint.getPlayer() == null)
&& otherPiece.connectsWithRoad();
}
What I'd do:
Pick a vertex with a road
Pick at most two roads to follow
Follow the the road; backtrack here if it branches, and choose that which was longer
If current vertex is in the visited list, backtrack to 3
Add the vertex to the visited list, recurse from 3
If there is no more road at 3, return the length
When you've followed at most 2 roads, add them up, note the length
If there were 3 roads at the initial vertex, backtrack to 2.
Sorry for the prolog-ish terminology :)
Here is my version if anyone needs it. Written in Typescript.
longestRoadLengthForPlayer(player: PlayerColors): number {
let longestRoad = 0
let mainLoop = (currentLongestRoad: number, tileEdge: TileEdge, passedCorners: TileCorner[], passedEdges: TileEdge[]) => {
if (!(passedEdges.indexOf(tileEdge) > -1) && tileEdge.owner == player) {
passedEdges.push(tileEdge)
currentLongestRoad++
for (let endPoint of tileEdge.hexEdge.endPoints()) {
let corner = this.getTileCorner(endPoint)!
if ((corner.owner == player || corner.owner == PlayerColors.None) && !(passedCorners.indexOf(corner) > -1)) {
passedCorners.push(corner)
for (let hexEdge of corner.hexCorner.touchingEdges()) {
let edge = this.getTileEdge(hexEdge)
if (edge != undefined && edge != tileEdge) {
mainLoop(currentLongestRoad, edge, passedCorners, passedEdges)
}
}
} else {
checkIfLongestRoad(currentLongestRoad)
}
}
} else {
checkIfLongestRoad(currentLongestRoad)
}
}
for (let tileEdge of this.tileEdges) {
mainLoop(0, tileEdge, [], [])
}
function checkIfLongestRoad(roadLength: number) {
if (roadLength > longestRoad) {
longestRoad = roadLength
}
}
return longestRoad
}
You could use Dijkstra and just change the conditions to choose the longest path instead.
http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
It's efficient, but won't always find the path that in reality is shortest/longest although it will work most of the times.

Trying to build algorithm for optimal tower placement in a game

This is going to be a long post and just for fun, so if you don't have much time better go help folks with more important questions instead :)
There is a game called "Tower Bloxx" recently released on xbox. One part of the game is to place different colored towers on a field in a most optimal way in order to maximize number of most valuable towers. I wrote an algorithm that would determine the most efficient tower placement but it is not very efficient and pretty much just brute forcing all possible combinations. For 4x4 field with 4 tower types it solves it in about 1 hr, 5 tower types would take about 40 hours which is too much.
Here are the rules:
There are 5 types of towers that could be placed on a field. There are several types of fields, the easiest one is just 4x4 matrix, others fields have some "blanks" where you can't build. Your aim is to put as many the most valuable towers on a field as possible to maximize total tower value on a field (lets assume that all towers are built at once, there is no turns).
Tower types (in order from less to most valuable):
Blue - can be placed anywhere, value = 10
Red - can be placed only besides blue, value = 20
Green - placed besides red and blue, value = 30
Yellow - besides green, red and blue, value = 40
White - besides yellow, green, red and blue, value = 100
Which means that for example green tower should have at least 1 red and 1 blue towers at either north, south, west or east neighbor cells (diagonals don't count). White tower should be surrounded with all other colors.
Here is my algorithm for 4 towers on 4x4 field:
Total number of combinations = 4^16
Loop through [1..4^16] and convert every number to base4 string in order to encode tower placement, so 4^16 = "3333 3333 3333 3333" which would represent our tower types (0=blue,...,3=yellow)
Convert tower placement string into matrix.
For every tower in a matrix check its neighbors and if any of requirements fails this whole combination fails.
Put all correct combinations into an array and then sort this array as strings in lexicographic order to find best possible combination (first need to sort characters in a string).
The only optimization I came up with is to skip combinations that don't contain any most valuable towers. It skips some processing but I still loop through all 4^16 combinations.
Any thought how this can be improved? Code samples would be helpful if in java or php.
-------Update--------
After adding more illegal states (yellow cannot be built in the corners, white cannot be built in corners and on the edges, field should contain at least one tower of each type), realizing that only 1 white tower could be possibly built on 4x4 field and optimizing java code the total time was brought down from 40 to ~16 hours. Maybe threading would bring it down to 10 hrs but that's probably brute forcing limit.
I found this question intriguing, and since I'm teaching myself Haskell, I decided to try my hand at implementing a solution in that language.
I thought about branch-and-bound, but couldn't come up with a good way to bound the solutions, so I just did some pruning by discarding boards that violate the rules.
My algorithm works by starting with an "empty" board. It places each possible color of tower in the first empty slot and in each case (each color) then recursively calls itself. The recursed calls try each color in the second slot, recursing again, until the board is full.
As each tower is placed, I check the just-placed tower and all of it's neighbors to verify that they're obeying the rules, treating any empty neighbors as wild cards. So if a white tower has four empty neighbors, I consider it valid. If a placement is invalid, I do not recurse on that placement, effectively pruning the entire tree of possibilities under it.
The way the code is written, I generate a list of all possible solutions, then look through the list to find the best one. In actuality, thanks to Haskell's lazy evaluation, the list elements are generated as the search function needs them, and since they're never referred to again they become available for garbage collection right away, so even for a 5x5 board memory usage is fairly small (2 MB).
Performance is pretty good. On my 2.1 GHz laptop, the compiled version of the program solves the 4x4 case in ~50 seconds, using one core. I'm running a 5x5 example right now to see how long it will take. Since functional code is quite easy to parallelize, I'm also going to experiment with parallel processing. There's a parallelized Haskell compiler that will not only spread the work across multiple cores, but across multiple machines as well, and this is a very parallelizable problem.
Here's my code so far. I realize that you specified Java or PHP, and Haskell is quite different. If you want to play with it, you can modify the definition of the variable "bnd" just above the bottom to set the board size. Just set it to ((1,1),(x, y)), where x and y are the number of columns and rows, respectively.
import Array
import Data.List
-- Enumeration of Tower types. "Empty" isn't really a tower color,
-- but it allows boards to have empty cells
data Tower = Empty | Blue | Red | Green | Yellow | White
deriving(Eq, Ord, Enum, Show)
type Location = (Int, Int)
type Board = Array Location Tower
-- towerScore omputes the score of a single tower
towerScore :: Tower -> Int
towerScore White = 100
towerScore t = (fromEnum t) * 10
-- towerUpper computes the upper bound for a single tower
towerUpper :: Tower -> Int
towerUpper Empty = 100
towerUpper t = towerScore t
-- boardScore computes the score of a board
boardScore :: Board -> Int
boardScore b = sum [ towerScore (b!loc) | loc <- range (bounds b) ]
-- boardUpper computes the upper bound of the score of a board
boardUpper :: Board -> Int
boardUpper b = sum [ bestScore loc | loc <- range (bounds b) ]
where
bestScore l | tower == Empty =
towerScore (head [ t | t <- colors, canPlace b l t ])
| otherwise = towerScore tower
where
tower = b!l
colors = reverse (enumFromTo Empty White)
-- Compute the neighbor locations of the specified location
neighborLoc :: ((Int,Int),(Int,Int)) -> (Int,Int) -> [(Int,Int)]
neighborLoc bounds (col, row) = filter valid neighborLoc'
where
valid loc = inRange bounds loc
neighborLoc' = [(col-1,row),(col+1,row),(col,row-1),(col,row+1)]
-- Array to store all of the neighbors of each location, so we don't
-- have to recalculate them repeatedly.
neighborArr = array bnd [(loc, neighborLoc bnd loc) | loc <- range bnd]
-- Get the contents of neighboring cells
neighborTowers :: Board -> Location -> [Tower]
neighborTowers board loc = [ board!l | l <- (neighborArr!loc) ]
-- The tower placement rule. Yields a list of tower colors that must
-- be adjacent to a tower of the specified color.
requiredTowers :: Tower -> [Tower]
requiredTowers Empty = []
requiredTowers Blue = []
requiredTowers Red = [Blue]
requiredTowers Green = [Red, Blue]
requiredTowers Yellow = [Green, Red, Blue]
requiredTowers White = [Yellow, Green, Red, Blue]
-- cellValid determines if a cell satisfies the rule.
cellValid :: Board -> Location -> Bool
cellValid board loc = null required ||
null needed ||
(length needed <= length empties)
where
neighbors = neighborTowers board loc
required = requiredTowers (board!loc)
needed = required \\ neighbors
empties = filter (==Empty) neighbors
-- canPlace determines if 'tower' can be placed in 'cell' without
-- violating the rule.
canPlace :: Board -> Location -> Tower -> Bool
canPlace board loc tower =
let b' = board // [(loc,tower)]
in cellValid b' loc && and [ cellValid b' l | l <- neighborArr!loc ]
-- Generate a board full of empty cells
cleanBoard :: Array Location Tower
cleanBoard = listArray bnd (replicate 80 Empty)
-- The heart of the algorithm, this function takes a partial board
-- (and a list of empty locations, just to avoid having to search for
-- them) and a score and returns the best board obtainable by filling
-- in the partial board
solutions :: Board -> [Location] -> Int -> Board
solutions b empties best | null empties = b
solutions b empties best =
fst (foldl' f (cleanBoard, best) [ b // [(l,t)] | t <- colors, canPlace b l t ])
where
f :: (Board, Int) -> Board -> (Board, Int)
f (b1, best) b2 | boardUpper b2 <= best = (b1, best)
| otherwise = if newScore > lstScore
then (new, max newScore best)
else (b1, best)
where
lstScore = boardScore b1
new = solutions b2 e' best
newScore = boardScore new
l = head empties
e' = tail empties
colors = reverse (enumFromTo Blue White)
-- showBoard converts a board to a printable string representation
showBoard :: Board -> String
showBoard board = unlines [ printRow row | row <- [minrow..maxrow] ]
where
((mincol, minrow), (maxcol, maxrow)) = bounds board
printRow row = unwords [ printCell col row | col <- [mincol..maxcol] ]
printCell col row = take 1 (show (board!(col,row)))
-- Set 'bnd' to the size of the desired board.
bnd = ((1,1),(4,4))
-- Main function generates the solutions, finds the best and prints
-- it out, along with its score
main = do putStrLn (showBoard best); putStrLn (show (boardScore best))
where
s = solutions cleanBoard (range (bounds cleanBoard)) 0
best = s
Also, please remember this is my first non-trivial Haskell program. I'm sure it can be done much more elegantly and succinctly.
Update: Since it was still very time-consuming to do a 5x5 with 5 colors (I waited 12 hours and it hadn't finished), I took another look at how to use bounding to prune more of the search tree.
My first approach was to estimate the upper bound of a partially-filled board by assuming every empty cell is filled with a white tower. I then modified the 'solution' function to track the best score seen and to ignore any board whose upper bound is less than than that best score.
That helped some, reducing a 4x4x5 board from 23s to 15s. To improve it further, I modified the upper bound function to assume that each Empty is filled with the best tower possible, consistent with the existing non-empty cell contents. That helped a great deal, reducing the 4x4x5 time to 2s.
Running it on 5x5x5 took 2600s, giving the following board:
G B G R B
R B W Y G
Y G R B R
B W Y G Y
G R B R B
with a score of 730.
I may make another modification and have it find all of the maximal-scoring boards, rather than just one.
If you don't want to do A*, use a branch and bound approach. The problem should be relatively easy to code up because your value functions are well defined. I imagine you should be able to prune off huge sections of the search space with relative ease. However because your search space is pretty large it may still take some time. Only one way to find out :)
The wiki article isn't the best in the world. Google can find you a ton of nice examples and trees and stuff to further illustrate the approach.
One easy way to improve the brute force method is to explore only legal states. For example, if you are trying all possible states, you will be testing many states where the top right corner is a white tower. All of these states will be illegal. It doesn't make sense to generate and test all of those states. So you want to generate your states one block at a time, and only go deeper into the tree when you are actually at a potentially valid state. This will cut down your search tree by many orders of magnitude.
There may be further fancy things you can do, but this is an easy to understand (hopefully) improvement to your current solution.
I think you will want to use a branch-and-bound algorithm because I think coming up with a good heuristic for an A* implementation will be hard (but, that's just my intuitition).
The pseudo-code for a branch-and-bound implementation is:
board = initial board with nothing on it, probably a 2D array
bestBoard = {}
function findBest(board)
if no more pieces can be added to board then
if score(board) > score(bestBoard) then
bestBoard = board
return
else
for each piece P we can legally add to board
newBoard = board with piece P added
//loose upper bound, could be improved
if score(newBoard) + 100*number of blanks in newBoard > score(bestBoard)
findBestHelper(newBoard)
The idea is that we search all possible boards, in order, but we keep track of the best one we have found so far (this is the bound). Then, if we find a partial board which we know will never be better than the best one so far then we stop looking working on that partial board: we trim that branch of the search tree.
In the code above I am doing the check by assuming that all the blanks would be filled by the white pieces, as we can't do better than that. I am sure that with a little bit of thought you can come up with a tighter bound than that.
Another place where you can try to optimize is in the order of the for-each loop. You want to try pieces in the order correct order. That is, optimally you want the first solution found to be the best one, or at least one with a really high score.
It seems like a good approach would be to start with a white tower and then build a set of towers around it based on the requirements, trying to find the smallest possible colored set of shapes which can act as interlocking tiles.
I wanted to advocate linear programming with integer unknowns, but it turns out that it's NP-hard even in the binary case. However, you can still get great success at optimizing a problem like yours, where there are many valid solutions and you simply want the best one.
Linear programming, for this kind of problem, essentially amounts to having a lot of variables (for example, the number of red towers present in cell (M, N)) and relationships among the variables (for example, the number of white towers in cell (M, N) must be less than or equal to the number of towers of the non-white color that has the smallest such number, among all its neighbors). It's kind of a pain to write up a linear program, but if you want a solution that runs in seconds, it's probably your best bet.
You've received a lot of good advice on the algorithmic side of things, so I don't have a lot to add. But, assuming Java as the language, here are a few fairly obvious suggestions for performance improvement.
Make sure you're not instantiating any objects inside that 4^16 loop. It's much, much cheaper for the JVM to re-initialize an existing object than to create a new one. Even cheaper to use arrays of primitives. :)
If you can help it, step away from the collection classes. They'll add a lot of overhead that you probably don't need.
Make sure you're not concatenating any strings. Use StringBuilder.
And lastly, consider re-writing the whole thing in C.

Finding the path with the maximum minimal weight

I'm trying to work out an algorithm for finding a path across a directed graph. It's not a conventional path and I can't find any references to anything like this being done already.
I want to find the path which has the maximum minimum weight.
I.e. If there are two paths with weights 10->1->10 and 2->2->2 then the second path is considered better than the first because the minimum weight (2) is greater than the minimum weight of the first (1).
If anyone can work out a way to do this, or just point me in the direction of some reference material it would be incredibly useful :)
EDIT:: It seems I forgot to mention that I'm trying to get from a specific vertex to another specific vertex. Quite important point there :/
EDIT2:: As someone below pointed out, I should highlight that edge weights are non negative.
I am copying this answer and adding also adding my proof of correctness for the algorithm:
I would use some variant of Dijkstra's. I took the pseudo code below directly from Wikipedia and only changed 5 small things:
Renamed dist to width (from line 3 on)
Initialized each width to -infinity (line 3)
Initialized the width of the source to infinity (line 8)
Set the finish criterion to -infinity (line 14)
Modified the update function and sign (line 20 + 21)
1 function Dijkstra(Graph, source):
2 for each vertex v in Graph: // Initializations
3 width[v] := -infinity ; // Unknown width function from
4 // source to v
5 previous[v] := undefined ; // Previous node in optimal path
6 end for // from source
7
8 width[source] := infinity ; // Width from source to source
9 Q := the set of all nodes in Graph ; // All nodes in the graph are
10 // unoptimized – thus are in Q
11 while Q is not empty: // The main loop
12 u := vertex in Q with largest width in width[] ; // Source node in first case
13 remove u from Q ;
14 if width[u] = -infinity:
15 break ; // all remaining vertices are
16 end if // inaccessible from source
17
18 for each neighbor v of u: // where v has not yet been
19 // removed from Q.
20 alt := max(width[v], min(width[u], width_between(u, v))) ;
21 if alt > width[v]: // Relax (u,v,a)
22 width[v] := alt ;
23 previous[v] := u ;
24 decrease-key v in Q; // Reorder v in the Queue
25 end if
26 end for
27 end while
28 return width;
29 endfunction
Some (handwaving) explanation why this works: you start with the source. From there, you have infinite capacity to itself. Now you check all neighbors of the source. Assume the edges don't all have the same capacity (in your example, say (s, a) = 300). Then, there is no better way to reach b then via (s, b), so you know the best case capacity of b. You continue going to the best neighbors of the known set of vertices, until you reach all vertices.
Proof of correctness of algorithm:
At any point in the algorithm, there will be 2 sets of vertices A and B. The vertices in A will be the vertices to which the correct maximum minimum capacity path has been found. And set B has vertices to which we haven't found the answer.
Inductive Hypothesis: At any step, all vertices in set A have the correct values of maximum minimum capacity path to them. ie., all previous iterations are correct.
Correctness of base case: When the set A has the vertex S only. Then the value to S is infinity, which is correct.
In current iteration, we set
val[W] = max(val[W], min(val[V], width_between(V-W)))
Inductive step: Suppose, W is the vertex in set B with the largest val[W]. And W is dequeued from the queue and W has been set the answer val[W].
Now, we need to show that every other S-W path has a width <= val[W]. This will be always true because all other ways of reaching W will go through some other vertex (call it X) in the set B.
And for all other vertices X in set B, val[X] <= val[W]
Thus any other path to W will be constrained by val[X], which is never greater than val[W].
Thus the current estimate of val[W] is optimum and hence algorithm computes the correct values for all the vertices.
You could also use the "binary search on the answer" paradigm. That is, do a binary search on the weights, testing for each weight w whether you can find a path in the graph using only edges of weight greater than w.
The largest w for which you can (found through binary search) gives the answer. Note that you only need to check if a path exists, so just an O(|E|) breadth-first/depth-first search, not a shortest-path. So it's O(|E|*log(max W)) in all, comparable to the Dijkstra/Kruskal/Prim's O(|E|log |V|) (and I can't immediately see a proof of those, too).
Use either Prim's or Kruskal's algorithm. Just modify them so they stop when they find out that the vertices you ask about are connected.
EDIT: You ask for maximum minimum, but your example looks like you want minimum maximum. In case of maximum minimum Kruskal's algorithm won't work.
EDIT: The example is okay, my mistake. Only Prim's algorithm will work then.
I am not sure that Prim will work here. Take this counterexample:
V = {1, 2, 3, 4}
E = {(1, 2), (2, 3), (1, 4), (4, 2)}
weight function w:
w((1,2)) = .1,
w((2,3)) = .3
w((1,4)) = .2
w((4,2)) = .25
If you apply Prim to find the maxmin path from 1 to 3, starting from 1 will select the 1 --> 2 --> 3 path, while the max-min distance is attained for the path that goes through 4.
This can be solved using a BFS style algorithm, however you need two variations:
Instead of marking each node as "visited", you mark it with the minimum weight along the path you took to reach it.
For example, if I and J are neighbors, I has value w1, and the weight of the edge between them is w2, then J=min(w1, w2).
If you reach a marked node with value w1, you might need to remark and process it again, if assigning a new value w2 (and w2>w1). This is required to make sure you get the maximum of all minimums.
For example, if I and J are neighbors, I has value w1, J has value w2, and the weight of the edge between them is w3, then if min(w2, w3) > w1 you must remark J and process all it's neighbors again.
Ok, answering my own question here just to try and get a bit of feedback I had on the tentative solution I worked out before posting here:
Each node stores a "path fragment", this is the entire path to itself so far.
0) set current vertex to the starting vertex
1) Generate all path fragments from this vertex and add them to a priority queue
2) Take the fragment off the top off the priority queue, and set the current vertex to the ending vertex of that path
3) If the current vertex is the target vertex, then return the path
4) goto 1
I'm not sure this will find the best path though, I think the exit condition in step three is a little ambitious. I can't think of a better exit condition though, since this algorithm doesn't close vertices (a vertex can be referenced in as many path fragments as it likes) you can't just wait until all vertices are closed (like Dijkstra's for example)
You can still use Dijkstra's!
Instead of using +, use the min() operator.
In addition, you'll want to orient the heap/priority_queue so that the biggest things are on top.
Something like this should work: (i've probably missed some implementation details)
let pq = priority queue of <node, minimum edge>, sorted by min. edge descending
push (start, infinity) on queue
mark start as visited
while !queue.empty:
current = pq.top()
pq.pop()
for all neighbors of current.node:
if neighbor has not been visited
pq.decrease_key(neighbor, min(current.weight, edge.weight))
It is guaranteed that whenever you get to a node you followed an optimal path (since you find all possibilities in decreasing order, and you can never improve your path by adding an edge)
The time bounds are the same as Dijkstra's - O(Vlog(E)).
EDIT: oh wait, this is basically what you posted. LOL.

Resources