I was wondering about an optimisation problem related to graph exploration:
Suppose we have a connected weighted graph in which each vertex has between 1 and 4 connections to other vertices. Some vertices contain chocolate. We place two students on two different vertices of the graph, each controlled by an AI that has access to the positions of the chocolates, the position of the other student, and the graph. At each turn, both students can move to a connected vertex (a move takes K turns if the weight of the edge is K). Finally, if a student is on a vertex containing chocolate, he eats it.
My question: what would be the best algorithm for the AI so that the student it controls eats more chocolate than the other?
Thank you.
The best approach to this would likely be using cost analysis and heuristics.
Your bot has 1-4 choices it can make, so you should consider each one. What's the benefit of going up? What's the cost? If you take value as benefit - cost, then you should choose the most valuable option.
So... How do you even calculate benefit and cost, anyways? You've outlined a few conditions already. The K value is a cost. Whether there is a chocolate there is a benefit. Maybe analyze the number of chocolates adjacent to a given vertex?
But wait! You know your enemy's position, and your enemy knows yours! If you move in a given direction, it'll influence their move (if they're smart). So you'll have to look ahead and analyze what your enemy will probably do in response to you. Chess solvers use breadth-first lookahead to plan out the best possible move via this exact system. Unfortunately, your problem is unbounded: there is no true win condition. So your bot will look ahead indefinitely (or until it runs out of memory, anyway). This is a problem, because calculations take time. Chess solvers get around this by imposing a time limit. They take the best move they've found by the time they run out of time.
Now, I've given you a general framework to work within. You still need to assemble it. Part of making heuristics is weighing your costs and benefits. Maybe the cost of K is insignificant compared to the benefit of getting a chocolate? Maybe it isn't? Use constant coefficients to weigh the costs and benefits.
As for finding your way through the graph, I'd take a look at Dijkstra's algorithm or A* to move.
Oh, and before I forget, here's the Wikipedia article on graph traversal. You may find it useful.
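To make the "value = benefit - cost" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the graph is a dict of dicts (`graph[u][v]` is the weight K of the edge), the names `dijkstra` and `choose_move` are made up, and the benefit formula (count chocolates you can reach strictly before the enemy, discounted by distance) is just one possible heuristic. The coefficients `w_benefit` and `w_cost` play the role of the constant coefficients mentioned above. It does no lookahead at all.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source; graph[u][v] is the weight of edge u-v."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def choose_move(graph, chocolates, me, enemy, w_benefit=10.0, w_cost=1.0):
    """Score every neighbour as weighted benefit minus weighted cost."""
    inf = float("inf")
    enemy_dist = dijkstra(graph, enemy)
    best_move, best_value = None, -inf
    for neighbour, k in graph[me].items():
        dist = dijkstra(graph, neighbour)
        # Assumed benefit: chocolates reachable through this neighbour
        # strictly before the enemy, with nearer ones counting more.
        benefit = sum(
            1.0 / (1 + k + dist.get(c, inf))
            for c in chocolates
            if k + dist.get(c, inf) < enemy_dist.get(c, inf)
        )
        value = w_benefit * benefit - w_cost * k
        if value > best_value:
            best_move, best_value = neighbour, value
    return best_move
```

Tuning the two coefficients (or replacing the benefit formula) is where the actual heuristic design happens; a lookahead over the enemy's replies would sit on top of a scoring function like this one.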
I have a directed, weighted routing graph (ca. 10^5 edges, 4 edges per node, lots of cycles).
Each edge has a cost associated with it. How can I rate the "connectedness" of each node? It should be a measure of how cheap it is to reach other nodes from this one.
How does everything change if every node gets a reliability factor (the probability that the chosen path containing that node will fail and a new one must be found)?
Thanks for your help
I believe that the problem you've put forth matches, in many respects, the use case of the PageRank algorithm.
I won't discuss how the algorithm works in general, since there are many blogs and videos available online which already explain it in great detail. One of my personal favourite short videos on the subject is this.
Now let's see how the algorithm fits your use case. Let's define the connectedness of a node x as C(x). We can rephrase your statement "how cheap it is to reach other nodes from this node" as "how likely are we to end up on the given node in a random walk across the graph, such that we are biased to take edges whose costs are lower".
That statement relates, to a large extent, to the idea behind the PageRank algorithm. We just need to consider how to include the edge cost in our working.
The original PageRank algorithm uniformly divides the page rank of a given node among all its adjacent nodes (denoted as PR(y) / OUT(y) in the formula). We, on the other hand, need to be more biased towards edges with lower cost, for which I'd recommend modifying the formula to
(SUM-EDGES-COST(y) - EDGE-COST(x, y)) * (C(y) / SUM-EDGES-COST(y))
instead of the traditional C(y) / OUT(y). We take the difference (SUM-EDGES-COST(y) - EDGE-COST(x, y)) since in our scenario a lower edge cost means more connectedness. Another possibility is to apply the softmax function to the edge costs of each node as a normalisation strategy.
To answer the part about having a reliability factor, given by R(x) for a node x: we can just multiply it directly with C(x) in the formula.
To wrap things up,
should match your given scenario.
What I've presented here is just one possibility that I can think of off the top of my head, and it's quite possible that it won't work out. All I can hope is that it helps you in some way or other. Cheers! :)
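A rough sketch of the cost-biased rank from this answer, as a plain power iteration in Python. Several things here are assumptions and not part of the answer: the graph is a dict of dicts (`graph[y][x]` is the cost of the edge y -> x, and every successor is also a key), the function name `cost_biased_rank` is made up, the damping factor 0.85 is the usual PageRank default, and the weights SUM-EDGES-COST(y) - EDGE-COST(x, y) are renormalised so that each node hands out exactly its own score. Nodes with a single outgoing edge fall back to the uniform split. The reliability factor is left out.

```python
def cost_biased_rank(graph, damping=0.85, iterations=100):
    """graph[y][x] = cost of edge y -> x. Returns {node: score}."""
    nodes = list(graph)
    n = len(nodes)
    # Transition probabilities: cheaper out-edges get a larger share.
    probs = {}
    for y, out in graph.items():
        total = sum(out.values())
        raw = {x: total - cost for x, cost in out.items()}
        norm = sum(raw.values())
        if norm > 0:
            probs[y] = {x: w / norm for x, w in raw.items()}
        else:
            # Single out-edge (or all costs zero): uniform split.
            probs[y] = {x: 1.0 / len(out) for x in out}
    score = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for y, out in probs.items():
            for x, p in out.items():
                nxt[x] += damping * score[y] * p
        score = nxt
    return score
```

Note that, like PageRank itself, this scores a node by how easily it is reached, which is the rephrasing the answer proposes; whether that matches the intended notion of connectedness is something to check against real data.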
There is a network of towns, connected by roads of various integer lengths.
A traveler wishes to travel in his car from one town to another. However, he does not want to minimize distance traveled; instead he wishes to minimize the petrol cost of the journey. Petrol can be bought in any city, however each city supplies petrol at various (integer) prices (hence why the shortest route is not necessarily the cheapest). 1 unit of petrol enables him to drive for 1 unit of distance.
His car can only hold so much petrol in the tank, and he can choose how many units of petrol to purchase at each city he travels through. Find the minimum petrol cost.
Does anyone know an efficient algorithm that could be used to solve this problem? Even the name of this type of problem would be useful so that I can research it myself! Obviously it's not quite the same as a shortest path problem. Any other tips appreciated!
EDIT - the actual problem I have states that there will be <1000 cities; <10000 roads; and the petrol tank capacity will be somewhere between 1 and 100.
You could solve this directly using Dijkstra's algorithm if you are happy to increase the size of the graph.
Suppose your petrol tank could hold from 0 to 9 units of petrol.
The idea would be to split each town into 10 nodes, with node x for town t representing being at town t with x units of petrol in the tank.
You can then construct zero-cost edges on this expanded graph to represent travelling between different towns (using up petrol in the process so you would go from a level 8 node to a level 5 node if the distance was 3), and more edges to represent filling up the tank at each town with one unit of petrol (with cost depending on the town).
Then applying Dijkstra should give the lowest cost path from the start to the end.
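A minimal Python sketch of this expanded-graph idea, under some assumptions that are mine and not the answer's: towns are numbered 0..n-1, `roads` is a list of `(a, b, length)` tuples for two-way roads, `prices[t]` is the price per unit in town t, the car starts with an empty tank, and the function name `min_petrol_cost` is made up. Instead of building the expanded graph up front, the states `(town, fuel)` are generated on the fly.

```python
import heapq

def min_petrol_cost(roads, prices, capacity, start, goal):
    """Dijkstra over states (town, fuel). Returns the minimum cost or None."""
    adj = {t: [] for t in range(len(prices))}
    for a, b, length in roads:
        adj[a].append((b, length))
        adj[b].append((a, length))
    inf = float("inf")
    best = {(start, 0): 0}
    heap = [(0, start, 0)]  # (cost so far, town, fuel in tank)
    while heap:
        cost, town, fuel = heapq.heappop(heap)
        if town == goal:
            return cost
        if cost > best.get((town, fuel), inf):
            continue  # stale heap entry
        moves = []
        if fuel < capacity:
            # Buy one unit of petrol here.
            moves.append((cost + prices[town], town, fuel + 1))
        for nxt, length in adj[town]:
            if length <= fuel:
                # Driving is a zero-cost edge that burns fuel.
                moves.append((cost, nxt, fuel - length))
        for new_cost, t, f in moves:
            if new_cost < best.get((t, f), inf):
                best[(t, f)] = new_cost
                heapq.heappush(heap, (new_cost, t, f))
    return None  # goal not reachable with this tank size
```

With the limits from the question (under 1000 cities and a tank of at most 100 units) there are at most about 10^5 states, which is small enough for this approach.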
I think the question is: Is there a chance the petrol stuff makes the underlying traveling salesman problem computationally more feasible? If not, there is no efficient non-approximating algorithm.
Of course, you can find efficient solutions for edge cases, and there might be more edge cases with the petrol condition, as in, always take this city first because the petrol is so cheap.
I think you can solve this with dynamic programming. For each node, you save an array of tuples of petrol cost and the length of the path on which you use that petrol, containing the optimal solution. At every step you loop through all nodes, and if there is a node you can go to which already has a solution, you loop through all the nodes with a solution that you can go to. You select the minimum cost, but note: you have to account for the petrol cost in the current node. All costs in the array that are higher than the cost in the current node can instead be bought at the current node. Note that nodes which already have a solution should be recalculated, as the nodes you can go to from there could change. You start with the end node, setting the solution to an empty array (or one entry with cost and length 0). The final solution is to take the solution at the beginning and sum up every cost * length.
I'd try this:
Find the shortest route from start to destination. Dijkstra's algorithm is appropriate for this.
Find the minimum cost of petrol to travel this route. I'm not aware of any off-the-shelf algorithm for this, but unless there are many cities along the route even an exhaustive search shouldn't be computationally infeasible.
Find the next shortest route ...
Defining precise stopping criteria is a bit of a challenge; it might be best just to stop once the minimum cost found for a newly tested route is greater than the minimum cost for a route already tested.
So, use 2 algorithms, one for each part of the problem.
This might be optimized suitably well using a Genetic Algorithm. Genetic Algorithms beat humans at some complex problems:
http://en.wikipedia.org/wiki/Genetic_algorithm
The gist of a Genetic Algorithm is:
1) Come up with a ranking function for candidate solutions
2) Come up with a pool of unique candidate solutions. Initialize it with some randomly-generated possibilities. Maybe 10 or 100 or 1000...
3) Copy a candidate solution from the pool and perturb it in some way - add a town, remove a town, add two towns, etc. This might improve or worsen matters - your ranking function will help you tell. Which one do you pick? Usually, you pick the best, but once in a while, you intentionally pick one that's not to avoid getting stuck on a local optimum.
4) Has the new solution already been ranked? If yes, junk it and go to
   If no, continue...
5) Add the perturbed candidate back to the pool under its newly-calculated rank
6) Keep going at this (repeat from #3) until you feel you've done it long enough
7) Finally, select the answer with the best rank. It may not be optimal, but it should be pretty good.
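The loop described above can be sketched generically in Python. This is only an illustration under assumptions of mine: candidates must be hashable, `rank`, `random_candidate` and `perturb` are hypothetical callbacks you would write for the petrol problem (a candidate could be a tuple of towns), higher rank means better, and `explore` is the chance of deliberately picking a non-best candidate. Like the steps above, it uses perturbation only and no crossover.

```python
import random

def genetic_search(rank, random_candidate, perturb,
                   pool_size=20, steps=500, explore=0.1, seed=0):
    """Returns (best candidate, pool) where pool maps candidate -> rank."""
    rng = random.Random(seed)
    pool = {}
    # Seed the pool with unique random candidates (bounded number of tries).
    for _ in range(pool_size * 10):
        if len(pool) >= pool_size:
            break
        candidate = random_candidate(rng)
        if candidate not in pool:
            pool[candidate] = rank(candidate)
    for _ in range(steps):
        # Usually perturb the best candidate, once in a while a random one.
        if rng.random() < explore:
            parent = rng.choice(list(pool))
        else:
            parent = max(pool, key=pool.get)
        child = perturb(parent, rng)
        if child in pool:
            continue  # already ranked: junk it and try again
        pool[child] = rank(child)
    best = max(pool, key=pool.get)
    return best, pool
```

For example, with `rank=lambda x: -(x - 7) ** 2`, `random_candidate=lambda rng: rng.randint(0, 100)` and `perturb=lambda x, rng: x + rng.choice((-1, 1))`, the search drifts towards 7. The pool grows without bound here; a real implementation would probably prune the worst candidates.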
You could also formulate that as an integer linear programming (ILP) problem. The advantage is that there are a number of off-the-shelf solvers for this task, and the complexity won't grow as fast with the size of the tank as it does in Peter's solution.
The variables in this particular problem will be the amounts of petrol purchased in each town, the amount in the car's tank in each town on the way, and the actual roads taken.
The constraints will have to guarantee that the car spends the necessary fuel on every road, that it never has less than 0 or more than MAX units of fuel in any town, and that the roads constitute a path from A to B.
The objective will be the total cost of the fuel purchased.
The whole thing may look monstrous (ILP formulations often do), but it does not mean it cannot be solved in a reasonable time.
In a tower defense game, you have an NxM grid with a start, a finish, and a number of walls.
Enemies take the shortest path from start to finish without passing through any walls (they aren't usually constrained to the grid, but for simplicity's sake let's say they are. In either case, they can't move through diagonal "holes")
The problem (for this question at least) is to place up to K additional walls to maximize the path the enemies have to take. For example, for K=14
My intuition tells me this problem is NP-hard if (as I'm hoping to do) we generalize this to include waypoints that must be visited before moving to the finish, and possibly also without waypoints.
But, are there any decent heuristics out there for near-optimal solutions?
[Edit] I have posted a related question here.
I present a greedy approach that may be close to optimal (though I couldn't find an approximation factor). The idea is simple: we should block the cells that sit in critical places of the maze. These places help to measure the connectivity of the maze. We can consider vertex connectivity and find a minimum vertex cut that disconnects the start from the finish, (s,f). After that, we remove some critical cells.
To turn the maze into a graph, take the dual of the maze. Find a minimum (s,f) vertex cut on this graph. Then examine each vertex in this cut. We remove a vertex if its deletion increases the length of all s,f paths, or if it is on a minimum-length path from s to f. After eliminating a vertex, recursively repeat the above process k times.
But there is an issue with this: we might remove a vertex that cuts every path from s to f. To prevent this, we can weight such a cutting node as high as possible. That is, first compute the minimum (s,f) cut; if the cut consists of just one node, give that vertex a high weight such as n^3, then compute the minimum (s,f) cut again. The single cutting vertex from the previous calculation doesn't belong to the new cut because of its weight.
But if there is just one path between s and f (after some iterations), we can't improve it. In this case we can use ordinary greedy algorithms, like removing a node from one of the shortest paths from s to f that doesn't belong to any cut. After that we can deal with the minimum vertex cut again.
The algorithm running time in each step is:
min-cut + path finding for all nodes in min-cut
O(min cut) + O(n^2)*O(number of nodes in min-cut)
And because the number of nodes in the min-cut cannot be greater than O(n^2), in a very pessimistic situation the algorithm is O(kn^4), but normally it shouldn't take more than O(kn^3), because normally the min-cut algorithm dominates path finding, and normally path finding doesn't take O(n^2).
I guess the greedy choice is a good start point for simulated annealing type algorithms.
P.S.: Minimum vertex cut is similar to minimum edge cut, and a max-flow/min-cut approach can be applied to it: split each vertex into two vertices, Vi and Vo (input and output). Converting the undirected graph to a directed one is also not hard.
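The vertex-splitting trick from the P.S. can be sketched in Python as follows. This is an illustration under my own assumptions, not the answer's implementation: the graph is an undirected adjacency dict in which every neighbour is also a key, s and t are not adjacent, the function name `min_vertex_cut_size` is made up, and max-flow is computed with plain BFS augmenting paths (Edmonds-Karp). It only returns the size of the cut, not the cut itself.

```python
from collections import defaultdict, deque

def min_vertex_cut_size(adj, s, t):
    """Number of vertices (other than s, t) whose removal separates s from t."""
    inf = float("inf")
    cap = defaultdict(lambda: defaultdict(int))
    for v in adj:
        # Split v into (v, "in") -> (v, "out"); capacity 1 means
        # "this vertex can be cut at cost 1". s and t cannot be cut.
        cap[(v, "in")][(v, "out")] = inf if v in (s, t) else 1
        for w in adj[v]:
            cap[(v, "out")][(w, "in")] = inf
    source, sink = (s, "in"), (t, "out")
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck along the path, then push flow.
        bottleneck, v = inf, sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck
```

For the maze, the vertices would be the open cells and the edges the 4-neighbour adjacencies; the high-weight trick from the answer corresponds to giving a vertex a larger in-to-out capacity.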
It can be easily shown (proof left as an exercise to the reader) that it is enough to search for the solution so that every one of the K blockades is put on the current minimum-length route. Note that if there are multiple minimal-length routes then all of them have to be considered. The reason is that if you don't put any of the remaining blockades on the current minimum-length route then it does not change; hence you can put the first available blockade on it immediately during search. This speeds up even a brute-force search.
But there are more optimizations. You can also always decide that you put the next blockade so that it becomes the FIRST blockade on the current minimum-length route, i.e. you work so that if you place the blockade on the 10th square on the route, then you mark the squares 1..9 as "permanently open" until you backtrack. This saves again an exponential number of squares to search for during backtracking search.
You can then apply heuristics to cut down the search space or to reorder it, e.g. first try those blockade placements that increase the length of the current minimum-length route the most. You can then run the backtracking algorithm for a limited amount of real-time and pick the best solution found thus far.
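A small Python sketch of this backtracking search, without the "permanently open" pruning or any time limit. The representation is assumed: the grid is a list of lists of characters with "#" for walls and "." for open cells, start and finish are `(row, column)` tuples, and the function names are made up. At each level it takes one shortest path found by BFS and tries a blockade on every cell of it.

```python
from collections import deque

def shortest_path(grid, start, finish):
    """BFS on a 4-connected grid. Returns the list of cells or None."""
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == finish:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and (nr, nc) not in parent):
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

def longest_after_walls(grid, start, finish, k):
    """Longest achievable shortest-path length (in steps) with up to k walls."""
    path = shortest_path(grid, start, finish)
    if path is None:
        return None  # this placement blocks the enemies completely
    best = len(path) - 1
    if k == 0:
        return best
    for r, c in path[1:-1]:  # never block the start or the finish
        grid[r][c] = "#"
        result = longest_after_walls(grid, start, finish, k - 1)
        grid[r][c] = "."  # backtrack
        if result is not None and result > best:
            best = result
    return best
```

The search is still exponential in k, so for real maps you would add the ordering heuristics and the time limit described above.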
I believe we can reduce the contained maximum manifold problem to boolean satisfiability and show NP-completeness through any dependency on this subproblem. Because of this, the algorithms spinning_plate provided are reasonable as heuristics, precomputing and machine learning is reasonable, and the trick becomes finding the best heuristic solution if we wish to blunder forward here.
Consider a board like the following:
..S........
#.#..#..###
...........
...........
..........F
This has many of the problems that cause greedy and gate-bound solutions to fail. If we look at that second row:
#.#..#..###
Our logic gates are, in a 0-based 2D array ordered as [row][column]:
[1][1], [1][3], [1][4], [1][6], [1][7]
We can re-render this as an equation to satisfy the block:
if [1][1] AND ([1][3] AND [1][4]) AND ([1][6] AND [1][7]):
    traversal_cost = INFINITY; longest = False # Infinity does not qualify
Excepting infinity as an unsatisfiable case, we backtrack and rerender this as:
if [1][1] AND ([1][3] AND [1][4]) AND [1][6]:
    traversal_cost = 6; longest = True
And our hidden boolean relationship falls amongst all of these gates. You can also show that geometric proofs can't fractalize recursively, because we can always create a wall that's exactly N-1 width or height long, and this represents a critical part of the solution in all cases (therefore, divide and conquer won't help you).
Furthermore, because perturbations across different rows are significant:
..S........
#.#........
...#..#....
.......#..#
..........F
We can show that, without a complete set of computable geometric identities, the complete search space reduces itself to N-SAT.
By extension, we can also show that this is trivial to verify and non-polynomial to solve as the number of gates approaches infinity. Unsurprisingly, this is why tower defense games remain so fun for humans to play. Obviously, a more rigorous proof is desirable, but this is a skeletal start.
Do note that you can significantly reduce the n term in your n-choose-k relation. Because we can recursively show that each perturbation must lie on the critical path, and because the critical path is always computable in O(V+E) time (with a few optimizations to speed things up for each perturbation), you can significantly reduce your search space at a cost of a breadth-first search for each additional tower added to the board.
Because we may tolerably assume O(n^k) for a deterministic solution, a heuristical approach is reasonable. My advice thus falls somewhere between spinning_plate's answer and Soravux's, with an eye towards machine learning techniques applicable to the problem.
The 0th solution: Use a tolerable but suboptimal AI, in which spinning_plate provided two usable algorithms. Indeed, these approximate how many naive players approach the game, and this should be sufficient for simple play, albeit with a high degree of exploitability.
The 1st-order solution: Use a database. Given the problem formulation, you haven't quite demonstrated the need to compute the optimal solution on the fly. Therefore, if we relax the constraint of approaching a random board with no information, we can simply precompute the optimum for all K tolerable for each board. Obviously, this only works for a small number of boards: with V! potential board states for each configuration, we cannot tolerably precompute all optimums as V becomes very large.
The 2nd-order solution: Use a machine-learning step. Promote each step as you close a gap that results in a very high traversal cost, running until your algorithm converges or no more optimal solution can be found than greedy. A plethora of algorithms are applicable here, so I recommend chasing the classics and the literature for selecting the correct one that works within the constraints of your program.
The best heuristic may be a simple heat map generated by a locally state-aware, recursive depth-first traversal, sorting the results by most to least commonly traversed after the O(V^2) traversal. Proceeding through this output greedily identifies all bottlenecks, and doing so without making pathing impossible is entirely possible (checking this is O(V+E)).
Putting it all together, I'd try an intersection of these approaches, combining the heat map and critical path identities. I'd assume there's enough here to come up with a good, functional geometric proof that satisfies all of the constraints of the problem.
At the risk of stating the obvious, here's one algorithm
1) Find the shortest path
2) Test blocking every node on that path and see which one results in the longest path
3) Repeat K times
Naively, this will take O(K*(V+ E log E)^2), but with a little work you could improve step 2 by only recalculating partial paths.
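The three steps above could look like this in Python. It is a naive sketch under assumed conventions: the grid is a list of lists of characters ("#" wall, "." open), start and finish are `(row, column)` tuples, the names `bfs_path` and `greedy_walls` are made up, and the shortest path is recomputed from scratch for every candidate. One extra choice of mine: the loop stops early when no single block lengthens the path.

```python
from collections import deque

def bfs_path(grid, start, finish):
    """Shortest path on a 4-connected grid as a list of cells, or None."""
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == finish:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and (nr, nc) not in parent):
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

def greedy_walls(grid, start, finish, k):
    """Places up to k walls in grid (in place) and returns their cells."""
    placed = []
    for _ in range(k):
        path = bfs_path(grid, start, finish)
        best_cell, best_len = None, len(path) - 1
        for r, c in path[1:-1]:
            grid[r][c] = "#"  # tentatively block this cell
            new_path = bfs_path(grid, start, finish)
            grid[r][c] = "."
            # Skip blocks that cut the enemies off entirely.
            if new_path is not None and len(new_path) - 1 > best_len:
                best_cell, best_len = (r, c), len(new_path) - 1
        if best_cell is None:
            break  # no single block helps any more
        grid[best_cell[0]][best_cell[1]] = "#"
        placed.append(best_cell)
    return placed
```

On an open 3x3 grid with the start and finish in two corners of the top row, two walls placed this way stretch the path from 2 steps to 6.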
As you mention, simply trying to break the path is difficult because if most breaks simply add a length of 1 (or 2), it's hard to find the choke points that lead to big gains.
If you take the minimum vertex cut between the start and the end, you will find the choke points for the entire graph. One possible algorithm is this
1) Find the shortest path
2) Find the min-cut of the whole graph
3) Find the maximal contiguous node set that intersects one point on the path, block those.
4) Wash, rinse, repeat
3) is the big part and why this algorithm may perform badly, too. You could also try
the smallest node set that connects with other existing blocks.
finding all groupings of contiguous vertices in the vertex cut, testing each of them for the longest path a la the first algorithm
The last one is what might be most promising
If you find a min vertex cut on the whole graph, you're going to find the choke points for the whole graph.
Here is a thought. In your grid, group adjacent walls into islands and treat every island as a graph node. Distance between nodes is the minimal number of walls that is needed to connect them (to block the enemy).
In that case you can start maximizing the path length by blocking the most cheap arcs.
I have no idea if this would work, because you could make new islands using your points, but it could help work out where to put walls.
I suggest using a modified breadth first search with a K-length priority queue tracking the best K paths between each island.
I would, for every island of connected walls, pretend that it is a light (a special light that can only send out horizontal and vertical rays of light).
Use ray-tracing to see which other islands the light can hit
say Island1 (i1) hits i2,i3,i4,i5 but doesn't hit i6,i7..
then you would have line(i1,i2), line(i1,i3), line(i1,i4) and line(i1,i5)
Mark the distance of all grid points to be infinity. Set the start point as 0.
Now use breadth first search from the start. Every grid point, mark the distance of that grid point to be the minimum distance of its neighbors.
But.. here is the catch..
Every time you get to a grid point that is on a line() between two islands, instead of recording the distance as the minimum of its neighbors, you need to make it a priority queue of length K, and record the K shortest paths to that line() from any of the other line()s.
This priority queue then stays the same until you get to the next line(), where it aggregates all priority queues going into that point.
You haven't shown the need for this algorithm to be realtime, but I may be wrong about this premise. You could then precalculate the block positions.
If you can do this beforehand and then simply make the AI build the maze rock by rock as if it was a kind of tree, you could use genetic algorithms to ease up your need for heuristics. You would need to load any kind of genetic algorithm framework, start with a population of non-movable blocks (your map) and randomly-placed movable blocks (blocks that the AI would place). Then, you evolve the population by making crossovers and mutations over movable blocks and then evaluate the individuals by giving more reward to the longest path calculated. You would then simply have to write a resource-efficient path calculator without the need of having heuristics in your code. In the last generation of your evolution, you would take the highest-ranking individual, which would be your solution, thus your desired block pattern for this map.
Genetic algorithms are proven to take you, under ideal conditions, to a local maximum (or minimum) in reasonable time, which may be impossible to reach with analytic solutions on a sufficiently large data set (i.e. a big enough map in your situation).
You haven't stated the language in which you are going to develop this algorithm, so I can't propose frameworks that may perfectly suit your needs.
Note that if your map is dynamic, meaning that the map may change over tower defense iterations, you may want to avoid this technique since it may be too intensive to re-evolve an entire new population every wave.
I'm not at all an algorithms expert, but looking at the grid makes me wonder if Conway's game of life might somehow be useful for this. With a reasonable initial seed and well-chosen rules about birth and death of towers, you could try many seeds and subsequent generations thereof in a short period of time.
You already have a measure of fitness in the length of the creeps' path, so you could pick the best one accordingly. I don't know how well (if at all) it would approximate the best path, but it would be an interesting thing to use in a solution.
Say, we have a circular list representing a solution of the traveling salesman problem. This list is initially empty.
If the user is allowed to enter cities and their coordinates one by one, what heuristics could be used to insert those coordinates into the already existing tour?
One example is the nearest neighbor heuristic: it inserts the new coordinate after the nearest coordinate already in the tour.
What are some other options (pseudo-code if possible)?
There are plenty of construction heuristics you can use, such as First Fit, First Fit Decreasing, Best Fit, Best Fit Decreasing and Cheapest Insertion.
Those constructions heuristics are applied on bin packing normally, but they can be converted to TSP too. Documentation about those heuristics is here.
Since you're only inserting 1 unassigned entity at a time, all of these basically revert to what you call the nearest neighbor heuristic (with a slight variation on ties), but note that that is not what is usually called Nearest Neighbor. Nearest Neighbor always adds to the end of the line the nearest neighbor of all unassigned entities.
Now, what you really want is a decent solution, without having to restart your entire construction heuristic. That's harder: welcome to repeated planning and real-time planning (and this documentation). I am working on an open source example for TSP and vehicle routing that does real-time planning.
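As an illustration of inserting one city at a time, here is a small Python sketch of cheapest insertion into a circular tour. It is a generic textbook-style version, not taken from the documentation mentioned above, and it assumes cities are `(x, y)` tuples with straight-line distances; the function name is made up. Unlike "insert after the nearest city", it looks at every edge of the tour and picks the one whose detour through the new city is shortest.

```python
import math

def cheapest_insertion(tour, city):
    """Insert city into tour (a list treated as circular), in place."""
    if len(tour) < 2:
        tour.append(city)
        return tour
    best_pos, best_delta = None, math.inf
    for i in range(len(tour)):
        a, b = tour[i], tour[(i + 1) % len(tour)]
        # Extra length if the edge a-b is replaced by a-city-b.
        delta = math.dist(a, city) + math.dist(city, b) - math.dist(a, b)
        if delta < best_delta:
            best_pos, best_delta = i + 1, delta
    tour.insert(best_pos, city)
    return tour
```

Each insertion costs O(n) for a tour of n cities. The result is still only a construction heuristic; a local search step such as 2-opt could be run afterwards to repair bad insertions.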
You can of course generalize the idea you have mentioned:
Define k'th_path(v) = minimum weight of a path including max{k,not_visited cities} cities
Note that calculating the k'th path is O(|V|^k) [this bound is not tight]
Special cases:
For k=1 you get the nearest neighbor, as you suggested.
For k=|V| you get an optimal solution [note it will be very expensive to calculate].
There are no other heuristics, because TSP is always about finding the nearest coordinate. At least, I don't know an algorithm that can insert a coordinate and knows the nearest coordinate, but there are plenty of algorithms to find a good tour. A good heuristic is, for example, the Christofides algorithm: it works only when the distances satisfy the triangle inequality (as they do in Euclidean space), but it guarantees a solution within 3/2 of the optimum. It's not very easy to code; in particular, the minimum-weight perfect matching step (Edmonds' blossom algorithm, e.g. the Blossom V implementation) requires expert skill. The importance of a guarantee can't be stressed enough: otherwise, how would you explain that your method can deliver nonsense in some rare situation?
I've been playing around with some things and thought up the idea of trying to figure out Kevin Bacon numbers. I have data for a site that for this purpose we can consider a social network. Let's pretend that it's Facebook (for simplification of discussion). I have people and I have a list of their friends, so I have the connections between them. How can I calculate the distance from one person to another (basically, a Kevin Bacon number)?
My best idea is a Bidirectional search, with a depth limit (to limit computational complexity and avoid the problem of people who simply can't be connected in the graph), but I realize this is rather brute force.
Could it be better to make little sub-graphs (say something equivalent to groups on Facebook), calculate the shortest distances between them (ahead of time, perhaps) and then try to use THOSE to find a link? While this requires pre-calculation, it could make it possible to search many fewer nodes (nodes could be groups instead of individuals, making the graph much smaller). This would still be a bidirectional search though.
I could also pre-calculate the number of people an individual is connected to, searching the nodes for "popular" people first since they could have the best chance of connecting to the given destination individual. I realize this would be a trade-off of speed for possible shortest path. I'd think I'd also want to use a depth-first search instead of the breadth-first search I'd plan to use in the other cases.
Can someone think of a simpler/faster way of doing this? I'd like to be able to find the shortest length between two people, so it's not as easy as always having the same end point (such as in the Kevin Bacon problem).
I realize that there are problems, like I could get chains of 200 people and such, but that can be solved by having a limit to the depth I'm willing to search.
This is a standard shortest path problem. There are lots of solutions, including Dijkstra's algorithm and Bellman-Ford. You may be particularly interested in looking at the A* algorithm and seeing how it would perform with the cost function relative to the inverse of any particular node's degree. The idea would be to visit more popular nodes (those with higher degree) first.
Sounds like a job for Dijkstra's algorithm.
Edit: Eh, I shouldn't have pulled the trigger so fast. Dijkstra's (and Bellman-Ford) reduces to a breadth-first search when the weights are 1, so this isn't too useful. Oh well.
The A* algorithm, mentioned by tvanfosson, may be ideal for this. The idea is that instead of searching and recursing in whatever order the elements are in each level of the tree (rooted on your start- or end-point), you use some heuristic to determine which element you are going to try first. In your case a good bet would probably be the degree of a node (number of "friends"), but you could possibly want to use the number of people within some arbitrary number of degrees of a given person (i.e., the guy who has three friends who each have 100 friends is likely to be a better node than the guy who has 20 friends in a clique that shuns outsiders). There's all sorts of other things you could use as a heuristic (friends get 2 points, friends-of-friends get 1 point; whatever, experiment).
Combine this with a depth limit (cut off after 6 degrees of separation, or whatever), and you can vastly improve your average case (worst case is still the same as basic BFS).
Run a breadth-first search in both directions (from each endpoint) and stop when you have a connection or reach your depth limit.
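A minimal Python sketch of that bidirectional search, assuming the friend lists are symmetric and stored in a dict (`friends[person]` is a list of that person's friends); the function name `separation` and the default depth limit of 6 are made up for the example. It always expands the smaller frontier one full level at a time, which is what keeps the answer a true shortest distance.

```python
def separation(friends, a, b, max_depth=6):
    """Shortest number of hops between a and b, or None within max_depth."""
    if a == b:
        return 0
    dist_a, dist_b = {a: 0}, {b: 0}
    front_a, front_b = [a], [b]
    depth_a = depth_b = 0
    while front_a and front_b and depth_a + depth_b < max_depth:
        # Expand the smaller frontier by one full level.
        expand_a = len(front_a) <= len(front_b)
        if expand_a:
            front, dist, other = front_a, dist_a, dist_b
        else:
            front, dist, other = front_b, dist_b, dist_a
        best, nxt = None, []
        for person in front:
            for friend in friends.get(person, ()):
                if friend in other:
                    # The two searches meet on this edge.
                    total = dist[person] + 1 + other[friend]
                    if best is None or total < best:
                        best = total
                elif friend not in dist:
                    dist[friend] = dist[person] + 1
                    nxt.append(friend)
        if best is not None:
            return best
        if expand_a:
            front_a, depth_a = nxt, depth_a + 1
        else:
            front_b, depth_b = nxt, depth_b + 1
    return None
```

With a branching factor of a few hundred friends per person, meeting in the middle explores far fewer people than a one-sided search to the same depth.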
Floyd-Warshall, which computes the all-pairs shortest distances, might be better overall.
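For completeness, a short Python sketch of Floyd-Warshall on an unweighted friendship graph. The dict-of-lists input (with every friend also present as a key) and the function name `all_pairs_hops` are assumptions for the example. It needs O(n^3) time and O(n^2) memory, so it only makes sense as a precomputation on a fairly small network.

```python
def all_pairs_hops(friends):
    """dist[p][q] = number of hops between p and q (inf if unconnected)."""
    people = list(friends)
    inf = float("inf")
    dist = {p: {q: (0 if p == q else inf) for q in people} for p in people}
    for p in people:
        for q in friends[p]:
            if p != q:
                dist[p][q] = 1  # direct friendship
    # Allow each person k in turn as an intermediate stop.
    for k in people:
        for i in people:
            for j in people:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```

After the precomputation, any query is a single lookup such as `dist["alice"]["bob"]` (hypothetical names).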