Suggest an algorithm (graph - possibly NP-Complete) - algorithm

There is a network of towns, connected by roads of various integer lengths.
A traveler wishes to travel in his car from one town to another. However, he does not want to minimize distance traveled; instead he wishes to minimize the petrol cost of the journey. Petrol can be bought in any city, however each city supplies petrol at various (integer) prices (hence why the shortest route is not necessarily the cheapest). 1 unit of petrol enables him to drive for 1 unit of distance.
His car can only hold so much petrol in the tank, and he can choose how many units of petrol to purchase at each city he travels through. Find the minimum petrol cost.
Does anyone know an efficient algorithm that could be used to solve this problem? Even the name of this type of problem would be useful so that I can research it myself! Obviously it's not quite the same as a shortest path problem. Any other tips appreciated!
EDIT - the actual problem I have states that there will be <1000 cities; <10000 roads; and the petrol tank capacity will be somewhere between 1 and 100.

You could solve this directly using Dijkstra's algorithm if you are happy to increase the size of the graph.
Suppose your petrol tank could hold from 0 to 9 units of petrol.
The idea would be to split each town into 10 nodes, with node x for town t representing being at town t with x units of petrol in the tank.
You can then construct zero-cost edges on this expanded graph to represent travelling between different towns (using up petrol in the process so you would go from a level 8 node to a level 5 node if the distance was 3), and more edges to represent filling up the tank at each town with one unit of petrol (with cost depending on the town).
Then applying Dijkstra's algorithm should give the lowest-cost path from the start to the end.
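A minimal sketch of this expanded-graph idea, assuming towns are numbered 0..n-1, roads are undirected, and the tank starts empty (none of which the question states). The function name and argument layout are hypothetical. Rather than building the expanded graph up front, the sketch generates the (town, fuel) nodes on the fly:

```python
import heapq

def min_petrol_cost(n, roads, price, capacity, start, goal):
    """Cheapest petrol cost from start to goal, or None if unreachable.

    n: number of towns (0..n-1); roads: list of (u, v, length), undirected;
    price[t]: cost of one unit of petrol in town t; capacity: tank size.
    Assumption of this sketch: the tank is empty at the start.
    """
    adj = [[] for _ in range(n)]
    for u, v, length in roads:
        adj[u].append((v, length))
        adj[v].append((u, length))

    inf = float("inf")
    best = {(start, 0): 0}      # node = (town, units of fuel in tank)
    heap = [(0, start, 0)]      # (cost so far, town, fuel)
    while heap:
        cost, town, fuel = heapq.heappop(heap)
        if town == goal:
            return cost
        if cost > best.get((town, fuel), inf):
            continue            # stale heap entry
        # Edge type 1: buy one unit of petrol in this town.
        if fuel < capacity and cost + price[town] < best.get((town, fuel + 1), inf):
            best[(town, fuel + 1)] = cost + price[town]
            heapq.heappush(heap, (cost + price[town], town, fuel + 1))
        # Edge type 2: drive along a road at zero cost, using up fuel.
        for other, length in adj[town]:
            if length <= fuel and cost < best.get((other, fuel - length), inf):
                best[(other, fuel - length)] = cost
                heapq.heappush(heap, (cost, other, fuel - length))
    return None
```

With the limits from the question (under 1000 towns, tank capacity up to 100) the expanded graph has roughly 100,000 nodes, which is well within reach of a heap-based Dijkstra.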

I think the question is: Is there a chance the petrol stuff makes the underlying traveling salesman problem computationally more feasible? If not, there is no efficient non-approximating algorithm.
Of course, you can find efficient solutions for edge cases, and there might be more edge cases with the petrol condition, as in, always take this city first because the petrol is so cheap.

I think you can solve this with dynamic programming. For each node, you save an array of tuples of petrol cost and the length of the path where you use that petrol, containing the optimal solution. At every step you loop through all nodes, and if there is a node you can go to which already has a solution, you loop through all the nodes you can go to that have a solution. You select the minimum cost, but note: you have to account for the petrol cost in the current node. All costs in the array that are higher than the cost in the current node can instead be bought at the current node. Note that nodes which already have a solution should be recalculated, as the nodes you can go to from there could change. You start with the end node, setting the solution to an empty array (or one entry with cost and length 0). The final solution is to take the solution at the beginning and sum up every cost * length.

I'd try this:
Find the shortest route from start to destination. Dijkstra's algorithm is appropriate for this.
Find the minimum cost of petrol to travel this route. I'm not aware of any off-the-shelf algorithm for this, but unless there are many cities along the route even an exhaustive search shouldn't be computationally infeasible.
Find the next shortest route ...
Defining precise stopping criteria is a bit of a challenge; it might be best just to stop once the minimum cost found for a newly-tested route is greater than the minimum cost for a route already tested.
So, use 2 algorithms, one for each part of the problem.

This might be optimized suitably well using a Genetic Algorithm. Genetic Algorithms beat humans at some complex problems:
http://en.wikipedia.org/wiki/Genetic_algorithm
The gist of a Genetic Algorithm is:
1. Come up with a ranking function for candidate solutions.
2. Come up with a pool of unique candidate solutions. Initialize it with some randomly-generated possibilities. Maybe 10 or 100 or 1000...
3. Copy a candidate solution from the pool and perturb it in some way - add a town, remove a town, add two towns, etc. This might improve or worsen matters - your ranking function will help you tell. Which one do you pick? Usually, you pick the best, but once in a while, you intentionally pick one that's not, to avoid getting stuck on a local optimum.
4. Has the new solution already been ranked? If yes, junk it. If no, continue...
5. Add the perturbed candidate back to the pool under its newly-calculated rank.
6. Keep going at this (repeat from #3) until you feel you've done it long enough.
7. Finally, select the answer with the best rank. It may not be optimal, but it should be pretty good.

You could also formulate that as an integer linear programming (ILP) problem. The advantage is that there are a number of off-the-shelf solvers for this task, and the complexity won't grow as fast with the size of the tank as it does in Peter's solution.
The variables in this particular problem will be the amounts of petrol purchased in any one town, the amount in the car's tank in any town on the way, and the actual roads taken.
The constraints will have to guarantee that the car spends the necessary fuel on every road, that it does not have less than 0 or more than MAX units of fuel in any town, and that the roads constitute a path from A to B.
The objective will be the total cost of the fuel purchased.
The whole thing may look monstrous (ILP formulations often do), but it does not mean it cannot be solved in a reasonable time.

Related

Game of pathfinding in a weighted graph

I was wondering about an optimisation problem related to graph exploration:
Suppose we have a connected weighted graph in which each vertex has between 1 and 4 connections to other vertices. Now let's say some vertices contain chocolate, and we place two students on two different vertices of the graph. Each student is controlled by an AI that has access to the positions of the chocolates, the position of the other student and the graph. At each turn, both of them can move to a connected vertex (a move takes K turns if the weight of the edge is K). Finally, if a student is on a vertex containing chocolate, he eats it.
My question: what would be the best algorithm for the AI so that the student it controls eats more chocolate than the other?
Thank you.
The best approach to this would likely be using cost analysis and heuristics.
Your bot has 1-4 choices it can make, so you should consider each one. What's the benefit of going up? What's the cost? If you take value as benefit - cost, then you should choose the most valuable option.
So... How do you even calculate benefit and cost, anyways? You've outlined a few conditions already. The K value is a cost. Whether there is a chocolate there is a benefit. Maybe analyze the number of chocolates adjacent to a given vertex?
But wait! You know your enemy's position, and your enemy knows yours! If you move in a given direction, it'll influence their move (if they're smart). So you'll have to look ahead and analyze what your enemy will probably do in response to you. Chess solvers use breadth-first lookahead to plan out the best possible move via this exact system. Unfortunately, your problem is unbounded - there is no true win condition. So your bot will look ahead indefinitely (or until it runs out of memory, anyway). This is a problem, because calculations take time. Chess solvers get around this by imposing a time limit: they take the best move they've found by the time they run out of time.
Now, I've given you a general framework to work within. You still need to assemble it. Part of making heuristics is weighing your costs and benefits. Maybe the cost of K is insignificant compared to the benefit of getting a chocolate? Maybe it isn't? Use constant coefficients to weigh the costs and benefits.
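As a rough illustration of the "value = benefit - cost" idea with constant coefficients, here is a one-step sketch. The function name, the argument layout and the default coefficients are all hypothetical and would need tuning, and the sketch ignores the opponent and any lookahead:

```python
def best_move(neighbours, has_chocolate, benefit_weight=10.0, cost_weight=1.0):
    """Pick the neighbouring vertex with the highest benefit - cost.

    neighbours: list of (vertex, k), where k is the edge weight (moves needed).
    has_chocolate: set of vertices that currently hold chocolate.
    The two weights are the constant coefficients to tune (assumed values).
    """
    def value(move):
        vertex, k = move
        benefit = benefit_weight if vertex in has_chocolate else 0.0
        cost = cost_weight * k
        return benefit - cost

    return max(neighbours, key=value)[0]
```

For example, with neighbours `[("a", 1), ("b", 3)]` and chocolate only on `"b"`, the defaults prefer `"b"`, while `benefit_weight=1.0` makes the cheaper move to `"a"` win.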
As for finding your way through the grid, I'd take a look at Dijkstra's algorithm or A* to move.
Oh, and before I forget, here's the wikipedia article on graph traversal. You may find it useful.

A* - Graph Traversal Heuristic

I have a graph that represents a city. I know the location of places of interest (nodes, which have an Importance value), the location of the hotel I'm staying in, how the nodes are connected, the traversal time between them, and I have access to latitude and longitude. There are no issues converting from time to distance and vice versa.
The objective is to tour the city, maximizing the importance per day but limiting one day of travel to 10 hours. A day begins and ends at the hotel. I have a working A* algorithm that chooses the lowest value but with no heuristic yet, which I guess makes it a BB for now. With that in mind:
Since I have access to Lat/Long, my first stab at a heuristic, while only dealing with times, would be the distance as the crow flies between a node and the hotel. Would this be an admissible heuristic? It gives me the shortest possible distance and time, so it wouldn't overestimate.
Now let's say the Importance of a node is between 1-4. In order to factor it in, one idea could be g(neighbor) = g(current) + (edge_cost / Importance^2). Assuming this would be valid (if not, why?):
But now the heuristic values would be in a different unit. Could a solution to this simply be to give the Hotel Importance = 1? If the value is the same, will it still be admissible? EDIT: I think this will end up giving me problems because of the difference in scale.
I still have to restrict the total amount of time. Should each node keep track of the total time spent, in order to compare to the limit, plus the g() and h() values, because of the different units?
And finally:
Since I have to start and end in the same node, what comes to mind is to explore a node and should I find the hotel see if I still have time to explore the neighbors instead of going back. However, if I still have time to expand to one more node, but time runs out and I can't get to the hotel from there, I'm assuming I'll have to backtrack to the parent.
I can't help but see similarities to the knapsack problem. Even though I have to use A*, is there any lesson I can take from it?
Must my heuristic be consistent in this case? If so, why?
By the way, the purpose here is pathfinding first, optimizations second.
This actually looks like a combination of the travelling salesman problem (TSP) and knapsack problem (KP). It's KP in this respect: the knapsack capacity is 10 (for total hours available in a day) and the locations are the items. The item value equals the location value. The item weight is equal to the time it takes to travel to the location (plus the location's portion of the trip back to the hotel). The challenge arises from the fact that an item's weight is unknown until you solve the optimal tour through the selected locations--enter the TSP and Pathfinding.
One approach might be to use a pathfinding algorithm (e.g. A*, Bellman–Ford, or Dijkstra's algorithm) primarily to compute a distance matrix between each node. The distance matrix can then be leveraged while solving the TSP portion of the problem: finding a tour through the locations and using the total time as the weight.
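A minimal sketch of the distance-matrix step, assuming the city graph is stored as an adjacency dict of `(neighbour, travel_time)` pairs (a hypothetical representation, not something given in the question). It runs Dijkstra once from every place of interest:

```python
import heapq

def dijkstra(adj, source):
    """Shortest travel time from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def distance_matrix(adj, places):
    """matrix[p][q] = shortest travel time between places of interest p and q."""
    wanted = set(places)
    matrix = {}
    for p in places:
        dist = dijkstra(adj, p)
        matrix[p] = {q: d for q, d in dist.items() if q in wanted}
    return matrix
```

The resulting matrix only covers the hotel and the places of interest, so the TSP and knapsack parts can work on a much smaller, complete graph.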
The next step is up to you. If you are looking for an approximate solution, many heuristics exist for both TSP and KP: See Christofides TSP Heuristic, or the Minimum TSP and Maximum Knapsack entries at the Compendium of NP Optimization problems.
If on the other hand you seek an optimal solution, you may be out of luck. Still, I recommend you find a copy of Graph Theory: An Algorithmic Approach by Nicos Christofides (ISBN-13: 978-0121743505). It provides heuristics for early backtracking in a depth-first search that expedite the search for optimal solutions to several NP-Complete problems.

Looking for an algorithm for an efficient itinerary

I want to write an app that helps, say, a travelling salesman or musician plan their tour.
So this is about making an efficient itinerary.
So they would put in their start and end points and the places they want to visit and the program would output a suggested route to encompass those points on a map.
The suggested route would obviously minimise the time, distance and financial costs assuming the edge info is given for nodes on the network.
Could someone post some pseudocode or pointers to sites that describe the algorithm(s) required to solve this problem?
I've looked at A* but that seems to be for just a start and end points.
Any ideas welcome
thanks
Alex
As the other authors wrote, it is a traveling salesman problem. For your musician tour planner you require both (i) an algorithm that calculates the least-cost path from one location to another [via a physical network] AND (ii) an algorithm that calculates the best sequence of locations given your start and end location and additional constraints such as time windows.
(i) can be solved with for example Dijkstra, A*, Contraction Hierarchies
(ii) can be solved with Held-Karp, Branch and Bound/Cut [exactly] and Lin-Kernighan, or any other (meta)heuristic that are also applied to solve vehicle routing problems (VRP) [heuristically]
However, implementing these algorithms efficiently is not a matter of days. Thus I would recommend that you use existing software. For (i) GraphHopper will be a perfect choice and for (ii) you can try jsprit. Both are written in Java and are open source.
TSP (Travelling Salesman Problem) is what you want, you just will have to adjust the cost function so that it's not purely based on distance. Most likely you'll want to translate distance to an actual dollar value that accounts for the travel cost as well as the travel time.
It might be good to have a slider to bias the calculation towards either travel time or travel cost (time is money and all). Although, it's not clear how helpful it would be given how computationally intensive it tends to be to optimize a TSP instance.
As mentioned, this is a case of the traveling salesman problem (TSP). Note that the TSP can be solved with a brute-force algorithm when n, the number of cities, is not too large. A musician may only want to visit 15 cities or less, so you can still find the best path with a brute-force search. You just need to calculate the weighting between the different cities (distance and other factors) and then check all possibilities to find the best possible routes. If there are more than 20 cities, you can still find the optimal solution, but you'll want a better algorithm than a direct brute-force search.
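A minimal brute-force sketch for small tours, assuming a precomputed cost matrix `cost[a][b]` (however you choose to combine distance, time and money) and fixed start and end points. The function name is hypothetical. The running time grows factorially with the number of stops, so this is only practical for a handful of them:

```python
from itertools import permutations

def best_itinerary(cost, start, end, stops):
    """Try every ordering of the intermediate stops and keep the cheapest.

    cost[a][b] is the (assumed precomputed) cost of travelling from a to b.
    Returns (route, total_cost).
    """
    best_route, best_cost = None, float("inf")
    for order in permutations(stops):
        route = (start, *order, end)
        total = sum(cost[a][b] for a, b in zip(route, route[1:]))
        if total < best_cost:
            best_route, best_cost = route, total
    return list(best_route), best_cost
```

For larger instances the same cost matrix can be fed to an exact method such as Held-Karp or to one of the heuristics mentioned above.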

Travelling by bus

If you have the full bus schedule for a country, how can you find the furthest anyone can travel in one day without visiting the same stop twice?
I assume a bus schedule gives you the full list of leaving and arriving times for every bus stop.
A slow and naive method would be as follows.
You can of course make a graph from the bus schedule with multiple directed edges between bus stops. You could then do a depth first search remembering the arrival time of the edge you took to get to each node and only taking edges from that stop that leave after the one that you took to get there. If you go to a node you have been to before you would only carry on from there if the current time in your traversal is before the earliest time you had ever visited that node before. You could record the furthest you can get from each node and then you could check each node to find the furthest you can travel overall.
This seems very inefficient however and it really isn't a normal graph problem. The problem is that in a normal directed graph if you can get from A to B and from B to C then you can get from A to C. This isn't true here.
What is the fastest you can solve this problem?
I think your original algorithm is pretty good.
You can think of your approach as being a version of Dijkstra's algorithm, in attempting to find the shortest path to each node.
Note that it is best at this stage to weight edges in the graph in terms of time. The idea is to use your Dijkstra-like algorithm to compute all nodes reachable within one day's worth of time, and then pick whichever of these nodes is furthest in space from the start point.
Implementations of Dijkstra can use a heap to retrieve the next node to explore in O(log n), and I think this would be a good enhancement to your approach as well. If you always choose the node that you can reach earliest, you never need to repeat the calculation for that node.
Overall the approach is:
For each starting point
Use a modified Dijkstra to compute all nodes reachable in 1 day
Find the furthest in space of all these nodes.
So for n starting points and e bus routes, the complexity is about O(n(n+e)log(n)) to get the optimal answer.
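A sketch of the "modified Dijkstra" step for a single starting point, assuming the schedule is a list of `(from_stop, depart, to_stop, arrive)` trips with times as numbers (for example minutes since midnight) and `arrive >= depart`. These names and the data layout are assumptions, not part of the question:

```python
import heapq

def earliest_arrivals(trips, start, start_time, end_time):
    """Earliest arrival time at every stop reachable before end_time.

    trips: list of (from_stop, depart, to_stop, arrive).
    A trip is usable only if it departs at or after our arrival at from_stop.
    """
    by_stop = {}
    for frm, dep, to, arr in trips:
        by_stop.setdefault(frm, []).append((dep, to, arr))

    arrival = {start: start_time}
    heap = [(start_time, start)]
    while heap:
        t, stop = heapq.heappop(heap)
        if t > arrival[stop]:
            continue  # a better arrival time was already processed
        for dep, to, arr in by_stop.get(stop, []):
            if dep >= t and arr <= end_time and arr < arrival.get(to, float("inf")):
                arrival[to] = arr
                heapq.heappush(heap, (arr, to))
    return arrival
```

The keys of the returned dict are the stops reachable within the day. Picking the one furthest in space from `start` needs stop coordinates, which this sketch leaves out.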
You should be able to get improved performance by using an appropriate heuristic in an A* search. Because you are maximizing distance, the heuristic must never underestimate the maximum distance still possible from a point (it has to be an optimistic bound), so you could use the maximum speed of a bus multiplied by the remaining time.
Instead of making multiple edges for each departure from a location, you can make multiple nodes per location / time.
Create one node per location per departure time.
Create one node per location per arrival time.
Create edges to connect departures to arrivals.
Create edges to connect a given node to the node belonging to the same location at the nearest future time.
By doing this, any path you can traverse through the graph is "valid" (meaning a traveler would be able to achieve this by a combination of bus trips or choosing to sit at a location and wait for a future bus).
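The construction above can be sketched as follows, assuming the schedule is a list of `(from_stop, depart, to_stop, arrive)` trips (a hypothetical layout). Nodes are `(stop, time)` pairs, and a departure and an arrival that share the same stop and time collapse into one node:

```python
from collections import defaultdict

def build_time_expanded_graph(trips):
    """Return a dict mapping (stop, time) to a list of successor (stop, time) nodes."""
    edges = defaultdict(list)
    times = defaultdict(set)
    for frm, dep, to, arr in trips:
        edges[(frm, dep)].append((to, arr))  # ride edge: departure -> arrival
        times[frm].add(dep)
        times[to].add(arr)
    for stop, stop_times in times.items():
        ordered = sorted(stop_times)
        for earlier, later in zip(ordered, ordered[1:]):
            # waiting edge: same stop, nearest future time
            edges[(stop, earlier)].append((stop, later))
    return dict(edges)
```

Every edge goes forward in time (assuming each trip arrives after it departs), so the result is a directed acyclic graph.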
Sorry to say, but as described this problem has a pretty high complexity. I misread the problem originally and thought it was NP-hard, but it is not. It does, however, have a pretty high complexity that I personally would not want to deal with. This algorithm is a pretty good approximation that gives a considerable complexity saving, which I personally think is worth it.
However, if all you want is an answer that is "pretty good", there are a lot of fairly efficient algorithms out there that will get close very quickly.
Personally I would suggest using a simple greedy algorithm here.
I've done this on a few (granted, small and contrived) examples and it's worked pretty well and has O(n log n) efficiency.
Associate a velocity with each node, velocity being the fastest you can move away from a given node. In my examples this velocity was distance_travelled/(wait_time + travel_time). I used the maximum velocity of all trips leaving a node as the velocity score for that node.
From your node/time calculate the velocities of all neighboring nodes and travel to the "fastest" node.
This algorithm is pretty good for the complexity as it basically transforms the problem into a static search, but there are a couple potential pitfalls that could be adjusted for depending on your data set.
The biggest issue with this algorithm is the possibility of a really fast bus going into the middle of nowhere. You could get around that by adding a "popularity" term to the velocity calculation (make more popular stops effectively faster) but depending on your data set that could easily make things either better or worse.
The simplistic graph representation, in which each city is a node and the edges represent time, will not work. That's because the "edge" is not always active -- it is only active at certain times of the day.
The second thing that comes to mind is Edward Tufte's Paris Train Schedule, which is a different kind of graph. But that does not quite fit the problem either. With the train schedule, the stations have a sequential relationship to one another, but that's not the case in general with cities and bus schedules.
But Tufte motivates the following way to model it as a graph. You could write code only to construct the graph and use a standard graph library that includes the shortest path algorithm.
Each bus trip is an edge with weight = distance covered
Each (city, departure) and (city, arrival) is a node
All nodes for a given city are connected by zero-weight edges in a time-ordered sequence, ignoring whether it is an arrival or a departure. This subgraph will look like a chain.
(it is a directed graph)
Linear Time Solution: Note that the graph will be a directed, acyclic graph. Finding the longest path in such a graph is linear. "A longest path between two given vertices s and t in a weighted graph G is the same thing as a shortest path in a graph −G derived from G by changing every weight to its negation. Therefore, if shortest paths can be found in −G, then longest paths can also be found in G."
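A minimal sketch of the linear-time longest-path computation on a DAG, assuming the graph is given as a dict from node to `(successor, weight)` pairs with non-negative weights. Instead of negating weights, it relaxes edges in topological order and keeps the maximum, which is equivalent. It does not enforce the "never visit the same stop twice" condition from the question:

```python
def longest_path_dag(edges):
    """Return (length, path) of a longest path in a DAG.

    edges: dict node -> list of (successor, weight), weights assumed >= 0.
    """
    nodes = set(edges)
    for outs in edges.values():
        nodes.update(v for v, _ in outs)

    indegree = {v: 0 for v in nodes}
    for outs in edges.values():
        for v, _ in outs:
            indegree[v] += 1

    # Kahn's algorithm: the list grows while we iterate over it.
    order = [v for v in nodes if indegree[v] == 0]
    for u in order:
        for v, _ in edges.get(u, []):
            indegree[v] -= 1
            if indegree[v] == 0:
                order.append(v)

    best = {v: 0 for v in nodes}  # longest distance ending at v
    parent = {}
    for u in order:
        for v, w in edges.get(u, []):
            if best[u] + w > best[v]:
                best[v] = best[u] + w
                parent[v] = u

    end = max(best, key=best.get)
    path = [end]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return best[end], path[::-1]
```

In the model described above, the nodes would be the (city, time) pairs, bus trips would carry their distance as weight, and the waiting edges inside a city would have weight zero.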
Hope this helps! If somebody can post a visualization of the graph, it would be nice. If I can do so myself, I will do 1 more edit.
Naive is the best you'll get -- http://en.wikipedia.org/wiki/Longest_path_problem
EDIT:
So the problem is twofold.
Create a list of graphs where it's possible to travel from pointA to pointB. Possible is in terms of times available for busA to travel from pointA to pointB.
Find longest path from all the possible generated path above.
Another approach would be to reevaluate the graph upon each node traversal and find the longest path.
It still reduces to finding longest possible path, which is NP-Hard.

How to find minimum number of transfers for a metro or railway network?

I am aware that Dijkstra's algorithm can find the minimum distance between two nodes (or in case of a metro - stations). My question though concerns finding the minimum number of transfers between two stations. Moreover, out of all the minimum transfer paths I want the one with the shortest time.
Now in order to find a minimum-transfer path I utilize a specialized BFS applied to metro lines, but it does not guarantee that the path found is the shortest among all other minimum-transfer paths.
I was thinking that perhaps modifying Dijkstra's algorithm might help - by heuristically adding weight (time) for each transfer, such that it would deter the algorithm from making transfer to a different line. But in this case I would need to find the transfer weights empirically.
Addition to the question:
I have been recommended to add a "penalty" to each time the algorithm wants to transfer to a different subway line. Here I explain some of my concerns about that.
I have put off this problem for a few days and got back to it today. After looking at the problem again it looks like doing Dijkstra algorithm on stations and figuring out where the transfer occurs is hard, it's not as obvious as one might think.
Here's an example:
If here I have a partial graph (just 4 stations) and their metro lines: A (red), B (red, blue), C (red), D (blue). Let station A be the source.
And the connections are :
---- D(blue) - B (blue, red) - A (red) - C (red) -----
If I follow the Dijkstra algorithm: initially I place A into the queue, then dequeue A in the 1st iteration and look at its neighbors, B and C. I update their distances according to the weights A-B and A-C. Now even though B connects two lines, at this point I don't know if I need to make a transfer at B, so I do not add the "penalty" for a transfer.
Let's say that the distance A-B < A-C, which causes B to be dequeued on the next iteration. Its neighbor is D, and only at this point do I see that a transfer had to be made at B. But B has already been processed (dequeued).
So I am not sure how this "delay" in determining the need for transfer would affect the integrity of the algorithm.
Any thoughts?
You can make each of your weights a pair: (# of transfers, time). You can add these weights in the obvious way, and compare them in lexicographic order (compare # of transfers first, use time as the tiebreaker).
Of course, as others have mentioned, using K * (# of transfers) + time for some large enough K produces the same effect, as long as you know the maximum time a priori and you don't run out of bits in your weight storage.
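A sketch of Dijkstra with (# of transfers, time) pairs compared lexicographically. To deal with the "delay" in detecting a transfer that the question describes, this sketch searches over (station, line arrived on) states rather than bare stations. That state choice, the function name and the input layout are assumptions of the sketch, not something the answer prescribes:

```python
import heapq
from itertools import count

def fewest_transfers_then_time(segments, source, target):
    """Return (transfers, minutes) for the best route, or None.

    segments: list of (station_a, station_b, line, minutes), undirected.
    """
    adj = {}
    for a, b, line, minutes in segments:
        adj.setdefault(a, []).append((b, line, minutes))
        adj.setdefault(b, []).append((a, line, minutes))

    tie = count()  # tie-breaker so the heap never compares stations or lines
    best = {(source, None): (0, 0)}  # None = not on any line yet
    heap = [((0, 0), next(tie), source, None)]
    while heap:
        weight, _, station, line = heapq.heappop(heap)
        if station == target:
            return weight
        if weight > best[(station, line)]:
            continue  # stale heap entry
        transfers, minutes = weight
        for nxt, nxt_line, dt in adj.get(station, []):
            changed = line is not None and nxt_line != line
            new_weight = (transfers + int(changed), minutes + dt)
            state = (nxt, nxt_line)
            if state not in best or new_weight < best[state]:
                best[state] = new_weight
                heapq.heappush(heap, (new_weight, next(tie), nxt, nxt_line))
    return None
```

Python tuples already compare lexicographically, so no custom comparison is needed. On the four-station example from the question, travelling from A to D costs one transfer, and it is counted when leaving B on the blue line.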
I'm going to be describing my solution using the A* Algorithm, which I consider to be an extension (and an improvement -- please don't shoot me) of Dijkstra's Algorithm that is easier to intuitively understand. The basics of it go like this:
Add the starting path to the priority queue, weighted by distance-so-far + minimum distance to goal
Every iteration, take the lowest weighted path and explode it into every path that is one step from it (discarding paths that wrap around themselves) and put it back into the queue. Stop if you find a path that ends in the goal.
Instead of making your weight simply distance-so-far + minimum-distance-to-goal, you could use two weights, Stops and Distance/Time, compared this way:
Compare stops first, and report this comparison if possible (i.e., if they aren't the same)
If stops are equal, compare distance traveled
And sort your queue this way.
If you've ever played Mario Party, think of stops as Stars and distance as Coins. In the middle of the game, a person with two stars and ten coins is going to be above someone with one star and fifty coins.
Doing this guarantees that the first node you take out of your priority queue will be the level that has the least amount of stops possible.
You have the right idea, but you don't really need to find the transfer weights empirically -- you just have to ensure that the weight for a single transfer is greater than the weight for the longest possible travel time. You should be pretty safe if you give a transfer a weight equivalent to, say, a year of travel time.
As Amadan noted in a comment, it's all about creating right graph. I'll just describe it in more details.
Consider two vertices (stations) to have an edge if they are on a single line. With this graph (and weights of 1) you will find the minimum number of transitions with Dijkstra.
Now, let's assume that the maximum travel time is always less than 10000 (use your own constant). Then the weight of edge AB (A and B are on one line) is time_to_travel_between(A, B) + 10000.
Running Dijkstra on such graph will guarantee that minimal number of transitions is used and minimum time is reached in the second place.
update on comment
Let's "prove" it. Suppose there are two solutions: one with 2 transfers and 40 minutes of travel time, and one with 3 transfers and 25 minutes of travel time. In the first case you travel on 3 lines, so the path weight will be 3*10000 + 40 = 30040. In the second it will be 4*10000 + 25 = 40025. The first solution will be chosen.
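A sketch of this construction, assuming each line is given as an ordered list of `(station, minutes_from_first_stop)` pairs and that travel along a line takes the difference of those offsets. The layout, the function name and the default penalty are hypothetical. Every pair of stations on the same line gets an edge weighted travel time + penalty, and plain Dijkstra does the rest:

```python
import heapq
from itertools import combinations

def min_transfer_route(lines, source, target, penalty=10_000):
    """Return (transfers, minutes), or None if target is unreachable.

    lines: dict line_name -> list of (station, minutes_from_first_stop).
    Assumes the total travel time of any route is below `penalty`.
    """
    adj = {}
    for stops in lines.values():
        for (a, ta), (b, tb) in combinations(stops, 2):
            weight = abs(tb - ta) + penalty  # one ride on a single line
            adj.setdefault(a, []).append((b, weight))
            adj.setdefault(b, []).append((a, weight))

    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            rides, minutes = divmod(d, penalty)
            return max(rides - 1, 0), minutes  # n rides means n - 1 transfers
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return None
```

Because every ride adds the penalty exactly once, `divmod` on the final distance recovers both the number of rides and the travel time.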
I had the same problem as you, until now. I was using Dijkstra. The penalty for transfers is a very good idea indeed and I've been using it for a while now. The main problem is that you cannot use it directly in the weight, as you first have to identify the transfer. And I didn't want to modify the algorithm.
So what I've been doing is this: each time you find a transfer, delete the node, add it back with the penalty weight and rerun the algorithm on the graph.
But this way I found out that Dijkstra won't work. And this is where I tried Floyd-Warshall, which, unlike Dijkstra, compares all possible paths through the graph between each pair of vertices.
Switching to Floyd-Warshall helped me with my problem. Hope it helps you as well.
It's easier to code and a lot easier to implement.
