This is part of a self-formulated question, so I have not been able to "Google" it, and my own attempts have been futile so far.
You are given a graph G(V, E) in which each vertex v has a profit w_v and each edge e has a cost c_e. Given a budget C, the task is to find a single path whose total edge cost is less than C and whose total vertex profit is maximum. "Path" has its normal definition here, i.e. a simple path that does not repeat vertices.
It is obvious that Hamiltonian path is a special case of this (set every profit and every edge cost to 1 and the budget C = |V| - 1; a path of total profit |V| then exists iff the graph has a Hamiltonian path), hence the problem is NP-hard, so I am looking for approximation algorithms and heuristics.
Mathematically:
Given a graph G(V, E) with
  c_e >= 0 for each edge e
  w_v >= 0 for each vertex v
find a simple path P such that
  sum of c_e over all edges e in P <= C
and the sum of w_v over all vertices v in P is maximised.
This is known as the Selective Travelling Salesman Problem, or Travelling Salesman with profits. Google Scholar should be able to give you some references. Metaheuristics such as genetic programming or tabu search are often used. If you want to solve the problem optimally, linear programming techniques would probably work (unfortunately, you don't state the size of the instances you're dealing with). If the length of the path is small (say 15 vertices), also color-coding might work.
One simple heuristic that comes to mind is a variation of stochastic hill climbing combined with a greedy algorithm.
Define value function that is increasing in the weight and decreasing with the cost. For example:
value(u,v) = w(v) / [c(u,v) + epsilon]
(+ epsilon for the case of c(u,v) = 0)
Now, the idea is:
From a vertex u, proceed to vertex v with probability:
P(v|u) = value(u,v) / sum_x value(u,x)   [sum over all feasible moves (u,x)]
Repeat until you cannot continue.
This will give you one solution quickly, but it is probably not near-optimal. However, since it is stochastic, you can re-run it again and again for as long as you have time.
This will give you an anytime algorithm for this problem, meaning - the more time you have - the better your solution is.
Some optimizations:
You can try to learn macros to accelerate each search, which will result in more searches for each amount of time, and probably - better solutions.
Usually, the first search is not stochastic but purely greedy, always following max{value(u,v)}.
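A minimal Python sketch of this randomized greedy walk (the graph representation and the function names are my own illustrative choices, not from the question):

```python
import random

def stochastic_greedy_path(adj, weight, start, budget, eps=1e-9):
    """One randomized greedy walk.

    adj[u] is a dict {v: edge_cost}, weight[v] is the profit of v.
    Returns (path, total_profit)."""
    path, visited = [start], {start}
    profit, spent = weight[start], 0.0
    u = start
    while True:
        # feasible moves: unvisited neighbours we can still afford
        moves = [(v, c) for v, c in adj[u].items()
                 if v not in visited and spent + c <= budget]
        if not moves:
            return path, profit
        # P(v|u) proportional to value(u, v) = w(v) / (c(u, v) + eps)
        values = [weight[v] / (c + eps) for v, c in moves]
        v, c = random.choices(moves, weights=values)[0]
        path.append(v)
        visited.add(v)
        profit += weight[v]
        spent += c
        u = v

def best_of(adj, weight, start, budget, restarts=100):
    # anytime behaviour: keep the best result over many cheap restarts
    return max((stochastic_greedy_path(adj, weight, start, budget)
                for _ in range(restarts)), key=lambda r: r[1])
```

Because each walk is cheap, `best_of` simply restarts as often as the time budget allows, which gives the anytime behaviour described above.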
Related
I have recently been learning about graph algorithms, and at my university we were taught that the result of Bellman-Ford is a table of distances from all nodes to all other nodes (all-pairs shortest paths). However, I did not understand how the algorithm achieves this, and tried to understand it by watching YouTube videos, looking up definitions on Wikipedia, and so forth...
Now here comes the problem:
I could not find resources that described the algorithm in a way that the result would be the all pairs shortest paths table, but only "from one node to all other nodes".
Can the Bellman-Ford algorithm be tweaked to produce the all-pairs shortest-paths table, or is my university lecturer completely wrong about this? (He did explain some algorithm that delivered all-pairs shortest paths and he called it Bellman-Ford; however, I think it cannot be Bellman-Ford.)
EDIT: I absolutely understand the Bellman-Ford algorithm for the Problem "shortest path from one node to all other nodes".
I also understand most of the algorithm that was taught at my university for "all pairs shortest paths".
I am just very confused since the algorithm at my university was also called "Bellman-Ford".
If you speak German: Here is a video where the university lecturer talks about his "Bellman-Ford" (which I think is not actually Bellman-Ford):
https://www.youtube.com/watch?v=3_zqU5GWo4w&t=715s
Bellman-Ford is an algorithm for finding the shortest paths from a given start node to every other node in the graph.
Using Bellman-Ford we can generate all-pairs shortest paths by running the algorithm once from each node. The worst-case time complexity of this approach is O(V * V * E), and for a complete graph it becomes O(V^4), where V is the number of vertices (nodes) and E is the number of edges in the graph.
There is a better algorithm for finding all-pairs shortest paths, which runs in O(V^3): the Floyd-Warshall algorithm.
Here you can read more about it: https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm
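As a sketch, here is the "run Bellman-Ford from every source" approach in plain Python (a directed edge list is assumed; negative-cycle detection is omitted for brevity):

```python
INF = float('inf')

def bellman_ford(n, edges, src):
    """Single-source shortest paths. edges is a list of (u, v, w)
    directed edges over vertices 0..n-1."""
    dist = [INF] * n
    dist[src] = 0
    for _ in range(n - 1):          # at most n-1 rounds of relaxation
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:             # early exit once stable
            break
    return dist

def all_pairs(n, edges):
    # O(V * V * E) overall: one Bellman-Ford run per source vertex
    return [bellman_ford(n, edges, s) for s in range(n)]
```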
The aim of the algorithm is to find the shortest path from the starting point to the ending point.
To do that, it computes the shortest distance from the starting point to every other point, and then reads off the path that reaches the destination with the smallest total.
To begin with, it starts with the starting point (A). Sets every point's cost to infinity.
Now it sees all the possible directions from A. And A's initial cost is set to zero.
Imagine it needs to travel only to B. There might be a direct edge connecting A to B with a cost of, say, 10.
But there is also a path via C: from A to C costs, say, 5, and from C to B costs only 2. This means there are two possible paths from A to B, one of cost 10 and the other of cost 5 + 2 = 7. So the algorithm updates the cost of reaching B from A to 7, not 10, and that path is selected.
You can imagine the same situation with many more points. The algorithm searches from the starting point toward the end point, traversing all the possible paths and updating the costs as needed. In the end it considers all the paths and selects the one with the smallest cost.
Now here comes the problem: I could not find resources that described the algorithm in a way that the result would be the all-pairs shortest-paths table, but only "from one node to all other nodes".
To understand that, imagine we have to reach A to D.
The individual cost of moving from one point to another is listed below
A to B :15
A to C :10
B to C :3
B to D :5
C to D :15
Initially set all points to infinity except A to zero.
First,
A->B : Cost=15(Update B's cost to 15)
A->C : Cost=10(Update C's cost to 10)
B->C : Cost=18 (B's cost plus the B->C edge cost; do not update C, as its cost of 10 is already smaller)
C->B : Cost=13 (C's cost plus the C->B edge cost; update B's cost to 13, as this is smaller than 15)
B->D : Cost=18 (B's new cost plus the B->D edge cost; update D's cost, as this is smaller than infinity)
C->D : Cost=25 (C's cost plus the C->D edge cost; do not update D)
So the path the algorithm chooses is the one that leads to D with a cost of 18, which turns out to be the smallest!
    B
  / | \
 A  |  D
  \ | /
    C
A->C->B->D Cost:18
Now you may read this link for better understanding. And things should be pretty clear.
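The relaxation steps above can be checked with a few lines of Python (treating the edges as undirected and relaxing Bellman-Ford-style):

```python
# Undirected edges from the example above: (u, v, cost)
edges = [('A', 'B', 15), ('A', 'C', 10), ('B', 'C', 3),
         ('B', 'D', 5), ('C', 'D', 15)]
# make each edge usable in both directions
arcs = edges + [(v, u, w) for u, v, w in edges]

dist = {x: float('inf') for x in 'ABCD'}
dist['A'] = 0
for _ in range(3):                    # |V| - 1 relaxation rounds
    for u, v, w in arcs:
        if dist[u] + w < dist[v]:
            dist[v] = dist[u] + w

print(dist['D'])                      # 18, via A -> C -> B -> D
```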
I asked in our university forum and got the following answer:
Bellman-Ford is originally "from one node". The invariant (idea under the hood of the algorithm) however does not change when applying the original Bellman-Ford algorithm to every node of the Graph.
The complexity of the original Bellman-Ford is O(V*E), i.e. O(V^3) on dense graphs, and if started from every node it would be O(V^4). However, there is a trick one can use, because the findings during the algorithm resemble multiplying the input matrix (containing the direct path lengths) with itself in the (min, +) algebra. Because this behaves like a ring, one can cheat and simply calculate matrix^2, matrix^4, matrix^8 and so on (this is the part I did not completely understand though), achieving O(V^3 * log V).
He called this algorithm Bellman-Ford as well, because the invariant/ idea behind the algorithm is still the same.
German answer in our public university forum
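The repeated-squaring trick described in that answer can be sketched in Python: "multiply" the direct-distance matrix by itself in the (min, +) semiring, so the k-th squaring covers all paths of up to 2^k edges (a sketch of the idea, not the lecturer's exact algorithm):

```python
INF = float('inf')

def min_plus(A, B):
    """'Multiply' two distance matrices in the (min, +) semiring."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def apsp_by_squaring(D):
    """D[i][j] = direct edge length (INF if absent, 0 on the diagonal).
    Repeated squaring (D^2, D^4, D^8, ...) covers paths of any length
    after about log2(n) squarings, giving O(V^3 log V) overall."""
    n = len(D)
    power = 1
    while power < n - 1:
        D = min_plus(D, D)
        power *= 2
    return D
```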
I implemented maximum independent set (MIS) using both a greedy algorithm and a backtracking algorithm.
The back tracking algorithm is as follows:
MIS(G = (V, E): a graph): largest set of independent vertices
    if |V| = 0 then return {}
    if |V| = 1 then return V
    pick u ∈ V
    Gout ← G − {u}           {remove u from V and E}
    Gin  ← G − {u} − N(u)    {N(u) are the neighbours of u}
    Sout ← MIS(Gout)
    Sin  ← MIS(Gin) ∪ {u}
    return maxsize(Sout, Sin)    {return Sin if there's a tie; there's a reason for this}
The greedy algorithm is to iteratively pick the node with the smallest degree, place it in the MIS and then remove it and its neighbors from G.
After running both algorithms on graphs of varying sizes, where each edge exists with probability 0.5, I have empirically found that the backtracking algorithm always finds a smaller maximum independent set than the greedy algorithm. Is this expected?
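For concreteness, here is a runnable Python sketch of the two algorithms from the question (my own translation of the pseudocode; names and graph representation are illustrative):

```python
def remove(graph, drop):
    """Return a copy of graph (dict: vertex -> set of neighbours)
    with the vertices in drop deleted."""
    return {v: nbrs - drop for v, nbrs in graph.items() if v not in drop}

def mis_backtrack(graph):
    """Exact MIS by branching on a vertex u: either exclude u,
    or include u and drop its neighbours."""
    if not graph:
        return set()
    u = next(iter(graph))
    # branch 1: u is not in the set
    without_u = mis_backtrack(remove(graph, {u}))
    # branch 2: u is in the set, so its neighbours are not
    with_u = {u} | mis_backtrack(remove(graph, {u} | graph[u]))
    # ties go to the branch that includes u
    return with_u if len(with_u) >= len(without_u) else without_u

def mis_greedy(graph):
    """Heuristic: repeatedly take the minimum-degree vertex and
    delete it together with its neighbours."""
    result = set()
    while graph:
        u = min(graph, key=lambda v: len(graph[v]))
        result.add(u)
        graph = remove(graph, {u} | graph[u])
    return result
```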
Your solution is strange. Backtracking is usually used for yes/no problems, not optimization ones. The algorithm you wrote depends heavily on how you pick u. And it definitely is not backtracking, because you never backtrack.
Such problem can be solved in a number of ways, e.g.:
genetic programming,
exhaustive searching,
solving the problem on the complement graph (maximum clique problem).
According to Wikipedia, this is an NP-hard problem:
A maximum independent set is an independent set of the largest possible size for a given graph G.
This size is called the independence number of G, and denoted α(G).
The problem of finding such a set is called the maximum independent set problem and is an NP-hard optimization problem.
As such, it is unlikely that there exists an efficient algorithm for finding a maximum independent set of a graph.
So, to find the maximum independent set of a graph you essentially have to examine all candidate sets (an algorithm whose time complexity is exponential). All the faster algorithms (greedy, genetic, or randomized ones) cannot guarantee the exact answer: they can guarantee a maximal independent set, but not a maximum one.
In conclusion: your backtracking approach is slower but exact, while the greedy approach is only an approximation algorithm.
For n stations, an n×n matrix A is given, where A[i][j] represents the time of a direct journey from station i to station j (i, j <= n).
The person travelling between stations always seeks least time. Given two station numbers a, b, how to proceed about calculating minimum time of travel between them?
Can this problem be solved without using graph theory, i.e. just by matrix A alone?
You do need graph theory in order to solve it - more specifically, you need Dijkstra's algorithm. Representing the graph as a matrix is neither an advantage nor a disadvantage to that algorithm.
Note, though, that Dijkstra's algorithm requires all distances to be nonnegative. If for some reason you have negative "distances" in your matrix, you must use the slower Bellman-Ford algorithm instead.
(If you're really keen on using matrix operations and don't mind that it will be terribly slow, you could use the Floyd-Warshall algorithm, which is based on quite simple matrix operations, to compute the shortest paths between all pairs of stations (massive overkill), and then pick the pair you're interested in...)
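A sketch of Dijkstra's algorithm reading the matrix directly (Python; using `None` to mark a missing direct connection is my own assumption about the input):

```python
import heapq

def dijkstra_matrix(A, a, b):
    """Least travel time from station a to station b, using only the
    n x n time matrix A (A[i][j] = direct time, None if no direct
    journey). All times must be nonnegative."""
    n = len(A)
    INF = float('inf')
    dist = [INF] * n
    dist[a] = 0
    heap = [(0, a)]
    while heap:
        d, i = heapq.heappop(heap)
        if i == b:
            return d                 # b settled: d is optimal
        if d > dist[i]:
            continue                 # stale heap entry
        for j in range(n):
            w = A[i][j]
            if w is not None and d + w < dist[j]:
                dist[j] = d + w
                heapq.heappush(heap, (dist[j], j))
    return dist[b]
```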
This looks strikingly similar to the traveling salesman problem which is NP hard.
Wiki link to TSP
FIRST,
The ideal path was (in order of importance):
1. shortest
My heuristic (f) was:
manhattan distance (h) + path length (g)
This was buggy because it favored paths which veered towards the target then snaked back.
SECOND,
The ideal path was:
1. shortest
2. approaches the destination from the same y coordinate (if of equal length)
My heuristic stayed the same. I checked for the second criteria at the end, after reaching the target. The heuristic was made slightly inefficient (to fix the veering problem) which also resulted in the necessary adjacent coordinates always being searched.
THIRD,
The ideal path:
1. shortest
2. approaches the destination from the same y coordinate (if of equal length)
3. takes the least number of turns
Now I tried making the heuristic (f):
manhattan distance (h) + path length (g) * number of turns (t)
This of course works for criteria #1 and #3, and fixes the veering problem inherently. Unfortunately it's now so efficient that testing for criteria #2 at the end is not working because the set of nodes explored is not large enough to reconstruct the optimal solution.
Can anyone advise me how to fit criteria #2 into my heuristic (f), or how else to tackle this problem?
CRITERIA 2 example: If the goal is (4,6) and the paths to (3,6) and (4,5) are of identical length, then the ideal solution should go through (3,6), because it approaches from the Y plane, instead of (4,5), which comes from the X plane. However, if the lengths are not identical, then the shortest path must be favored regardless of which plane it approaches from.
You seem to be confusing the A* heuristic, what Russell & Norvig call h, with the partial path cost g. Together, these constitute the priority f = g + h.
The heuristic should be an optimistic estimate of how much it costs to reach the goal from the current point. Manhattan distance is appropriate for h if steps go up, down, left and right and take at least unit cost.
Your criterion 2, however, should go in the path cost g, not in h. I'm not sure what exactly you mean by "approaches the destination from the same y coordinate", but you can forbid/penalize entry into the goal node by giving all other approaches an infinite or very high path cost. There's strictly no need to modify the heuristic h.
The number of turns taken so far should also go in the partial path cost g. You may want to include in h an (optimistic) estimate of how many turns there are left to take, if you can compute such a figure cheaply.
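One way to follow this advice is to make the search state a (position, heading) pair, so turns are charged in g while h stays plain Manhattan distance. A hedged Python sketch (the grid representation and the `turn_cost` value are illustrative assumptions, not from the question):

```python
import heapq
from itertools import count

def astar_turns(grid, start, goal, turn_cost=0.25):
    """A* where the state is (position, heading): the turn penalty
    accumulates in the path cost g instead of distorting h."""
    def h(p):   # plain Manhattan distance: admissible and consistent
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    tie = count()                       # heap tiebreaker
    best_g = {(start, None): 0}
    heap = [(h(start), next(tie), 0, start, None, [start])]
    while heap:
        _, _, g, pos, heading, path = heapq.heappop(heap)
        if pos == goal:
            return path
        if g > best_g.get((pos, heading), float('inf')):
            continue                    # stale entry
        for d in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (pos[0] + d[0], pos[1] + d[1])
            if nxt not in grid:
                continue
            # unit step cost, plus a penalty whenever heading changes
            ng = g + 1 + (turn_cost if heading not in (None, d) else 0)
            if ng < best_g.get((nxt, d), float('inf')):
                best_g[(nxt, d)] = ng
                heapq.heappush(heap, (ng + h(nxt), next(tie), ng,
                                      nxt, d, path + [nxt]))
    return None
```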
Answering my own question with somewhat of a HACK. Still interested in other answers, ideas, comments, if you know of a better way to solve this.
Hacked manhattan distance is calculated towards the nearest square in the Y plane, instead of the destination itself:
dy = min(absolute_value(dy), absolute_value(dy-1));
Then when constructing heuristic (f):
h = hacked_manhattan_distance();
if (h < 2)
// we are close to the goal
// switch back to real distance
h = real_manhattan_distance();
I'm searching for an algorithm to find pairs of adjacent nodes on a hexagonal (honeycomb) graph that minimizes a cost function.
each node is connected to three adjacent nodes
each node "i" should be paired with exactly one neighbor node "j".
each pair of nodes defines a cost function
c = pairCost( i, j )
The total cost is then computed as
totalCost = 1/2 sum_{i=1:N} ( pairCost(i, pair(i) ) )
Where pair(i) returns the index of the node that "i" is paired with. (The sum is divided by two because the sum counts each node twice). My question is, how do I find node pairs that minimize the totalCost?
The linked image should make it clearer what a solution would look like (thick red line indicates a pairing):
Some further notes:
I don't really care about the outermost nodes
My cost function is something like || v(i) - v(j) || (distance between vectors associated with the nodes)
I'm guessing the problem might be NP-hard, but I don't really need the truly optimal solution, a good one would suffice.
Naive algos tend to get nodes that are "locked in", i.e. all their neighbors are taken.
Note: I'm not familiar with the usual nomenclature in this field (is it graph theory?). If you could help with that, then maybe that could enable me to search for a solution in the literature.
This is an instance of the maximum weight matching problem in a general graph - of course you'll have to negate your costs to turn your minimisation into a maximisation. Edmonds's paths, trees and flowers algorithm (Wikipedia link) solves this for you (there is also a public Python implementation). The naive implementation is O(n^4) for n vertices, but it can be pushed down to O(n^(1/2) * m) for n vertices and m edges using the algorithm of Micali and Vazirani (sorry, couldn't find a PDF for that).
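To experiment on tiny instances before reaching for a blossom implementation (e.g. `networkx.max_weight_matching` on negated costs), a brute-force minimum-cost perfect matching can be sketched in pure Python:

```python
def min_cost_matching(nodes, cost):
    """Brute-force minimum-cost perfect matching; only viable for a
    handful of nodes. cost maps frozenset({i, j}) -> pairCost(i, j)
    for each existing edge (an illustrative representation)."""
    nodes = list(nodes)
    if not nodes:
        return 0, []
    best = (float('inf'), None)
    i = nodes[0]
    for j in nodes[1:]:
        e = frozenset((i, j))
        if e not in cost:
            continue                 # i and j are not adjacent
        sub_cost, sub = min_cost_matching(
            [k for k in nodes if k not in (i, j)], cost)
        total = cost[e] + sub_cost
        if total < best[0]:
            best = (total, [(i, j)] + sub)
    return best
```

Each node is matched exactly once, so this directly minimises the totalCost defined in the question (without the factor 1/2, since each pair is counted once).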
This seems related to the minimum edge cover problem, with the additional constraint that there can only be one edge per node, and that you're trying to minimize the cost rather than the number of edges. Maybe you can find some answers by searching for that phrase.
Failing that, your problem can be phrased as an integer linear programming problem, which is NP-hard, which means that you might get dreadful performance even for medium-sized instances. (This does not necessarily mean that your problem itself is NP-hard, though.)