I have a set of nodes. Travel cost from one node to a connected node is always 1, but not all nodes are connected directly. That is to say, travel from node A to C may require passing through node B, and its total travel cost would be 2.
I then have a set of ordered pair waypoints. Each waypoint pair contains an origin node and destination node, which must be visited in order.
The ordered pairs themselves do not have to be visited in any particular order, nor must the destination node be visited immediately following the origin node.
A node may be visited twice, if that would optimize the overall route. It should never need to be visited thrice.
How can I order my nodes to achieve a minimum travel cost and ensure all nodes contained in a waypoint are visited, and adhere to the ordered pair rule above?
I'm banging my head against the wall with this.
Not a complete answer, just thinking out loud; a sort of greedy approach:
1. Compute the shortest-distance matrix for the graph.
2. Find the pair of waypoints for which the shortest route is the longest.
3. Append that route to the final result.
4. Remove any pair of waypoints that appears in that route from the set of waypoints that are yet to be handled.
5. Repeat from 2 for the remaining waypoints until there are none left.
If you encounter a pair of waypoints which appear in a route in reverse order, you need to come up with an 'efficient' way to traverse the route in reverse as well.
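Since every edge costs 1, the shortest-distance matrix from the first step can be built with one breadth-first search per node. A minimal sketch, assuming the graph is given as an adjacency dict (the name `all_pairs_hops` and the example graph are made up for illustration):

```python
from collections import deque

def all_pairs_hops(adj):
    """Return dist[u][v] = fewest edges from u to v, for every reachable pair."""
    dist = {}
    for source in adj:
        seen = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in seen:
                    seen[nxt] = seen[node] + 1
                    queue.append(nxt)
        dist[source] = seen
    return dist

# Hypothetical example: a simple chain A - B - C - D.
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
dist = all_pairs_hops(adj)
```

Unreachable pairs are simply absent from the inner dict, so a lookup should use `dist[u].get(v)` if the graph may be disconnected.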
another idea:
Find the minimum spanning tree of the graph and traverse it left to right and right to left.
Unfortunately this is an NP-hard problem, since the (Metric) Traveling Salesman Problem reduces to it.
This means that, unless P = NP, it cannot be solved in polynomial time in the general case. There are several approaches you can take:
Accept superpolynomial running times
Relax the requirements, so it is no longer NP-hard
Use an approximation algorithm
Constrain your graphs to a set that can be solved efficiently (such as bounded branchwidth graphs)
Or you could use a combination of approaches. For example, use an exact algorithm that solves common cases quickly, and fall back to an approximation algorithm in pathological cases.
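To make the "accept superpolynomial running times" option concrete, here is a rough brute-force sketch. It treats every origin and destination as a separate stop, so a node that appears in several pairs can be visited more than once, and it tries every ordering in which each origin precedes its destination. The function name `best_route` and the example data are assumptions for illustration, and this is only usable for a handful of pairs:

```python
from itertools import permutations

def best_route(pairs, dist):
    """Exhaustively order the 2*len(pairs) stops; only viable for a few pairs."""
    stops = [(i, end) for i in range(len(pairs)) for end in (0, 1)]
    best_cost, best_nodes = float("inf"), None
    for order in permutations(stops):
        pos = {stop: k for k, stop in enumerate(order)}
        # Each origin (end 0) must come before its destination (end 1).
        if any(pos[(i, 0)] > pos[(i, 1)] for i in range(len(pairs))):
            continue
        nodes = [pairs[i][end] for i, end in order]
        cost = sum(dist[a][b] for a, b in zip(nodes, nodes[1:]))
        if cost < best_cost:
            best_cost, best_nodes = cost, nodes
    return best_cost, best_nodes

# Hypothetical example: chain A - B - C - D, so distance = difference in position.
line = "ABCD"
dist = {a: {b: abs(i - j) for j, b in enumerate(line)} for i, a in enumerate(line)}
cost, route = best_route([("A", "C"), ("B", "D")], dist)
```

The `dist` table is expected to contain `dist[x][x] == 0`, so that two consecutive stops on the same node cost nothing.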
I have a question about an algorithm problem on a weighted graph. I am given an edge list with weights, a list of savepoints, a start and an end node, and the max distance for a step.
The output should be a list of the savepoints which are accessible in one step from both the start and the end node.
I thought of some kind of Dijkstra's algorithm from each point of the list of savepoints.
I'm not sure if that's a good idea, since if I have many savepoints I calculate a lot of paths multiple times. Every idea/help is welcome!
Thank you very much in advance!
You have to have the condition that a weight cannot be negative, otherwise the problem becomes much harder. With non-negative weights it's essentially a breadth-first search that records the distance for every visited node, so you don't revisit a node if a previous move has already reached it at lower cost.
You keep a priority queue of all active nodes, so you are checking the lowest-cost node each time. The priority queue is in fact the hardest part to get right. If you check the A* algorithm for my binary image library https://github.com/MalcolmMcLean/binaryimagelibrary you can take the priority queue from there. A* over a maze is very similar to shortest path over a graph, but you don't have a heuristic because you must have the exact shortest path, and instead of 4 / 8 edges per tile, you have nodes with arbitrary numbers of connections.
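As a rough illustration of the priority-queue search described above, the sketch below runs it only twice, once from the start node and once from the end node, and stops expanding beyond the step limit. It assumes an undirected graph with non-negative weights and reads "accessible in one step" as "within the max distance"; the names `within_range` and `reachable_savepoints` and the example graph are made up:

```python
import heapq

def within_range(graph, source, max_dist):
    """Dijkstra from source, ignoring anything farther away than max_dist."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for nxt, w in graph[node]:
            nd = d + w
            if nd <= max_dist and nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist

def reachable_savepoints(graph, savepoints, start, end, max_dist):
    from_start = within_range(graph, start, max_dist)
    from_end = within_range(graph, end, max_dist)
    return [s for s in savepoints if s in from_start and s in from_end]

# Hypothetical undirected graph: S -2- a -2- b -2- E, plus S -10- c.
graph = {
    "S": [("a", 2), ("c", 10)],
    "a": [("S", 2), ("b", 2)],
    "b": [("a", 2), ("E", 2)],
    "E": [("b", 2)],
    "c": [("S", 10)],
}
result = reachable_savepoints(graph, ["a", "b", "c"], "S", "E", 4)
```

Running the search from the two endpoints instead of from every savepoint avoids recomputing the same paths many times.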
Working on an algorithm for a game I am developing with a friend and we got stuck. Currently, we have a cyclic undirected graph, and we are trying to find the quickest path from starting node S that covers every edge. We are not looking for a tour and there can be repeated edges.
Any ideas on an algorithm or approximation? I'm sure this problem is NP-hard, but I don't believe it's TSP.
Route Inspection
This is known as the route inspection problem and it does have a polynomial solution.
The basic idea (see the link for more details) is that it is easy to solve for an Eulerian path (where we visit every edge once), but an Eulerian path is only possible for certain graphs.
In particular, a graph has to be connected and have either 0 or 2 vertices of odd degree.
However, it is possible to generalise this for other graphs by adding additional edges in the cheapest way that will produce a graph that does have an Eulerian path. (Note that we have added more edges so we may travel multiple times over edges in the original graph.)
Choosing the best way to add the additional edges is a minimum-weight matching problem on the odd-degree vertices, which can be solved in O(n^3).
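The degree condition is easy to check in code. A small sketch, assuming the graph is connected and given as a list of undirected edges (the function names are hypothetical):

```python
from collections import Counter

def odd_degree_vertices(edges):
    """Vertices touched by an odd number of edges."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(v for v, d in degree.items() if d % 2 == 1)

def has_eulerian_path(edges):
    # Assumes the graph is connected; only the degree condition is checked.
    return len(odd_degree_vertices(edges)) in (0, 2)

# Hypothetical examples: a square with one diagonal, and a three-armed star.
square = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("A", "C")]
star = [("S", "A"), ("S", "B"), ("S", "C")]
```

For the star, four vertices have odd degree, so some edges have to be duplicated; the odd-degree vertices are exactly the ones the matching step pairs up.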
P.S.
Coincidentally I wrote a simple demo earlier today (link to game) for a planar max-cut problem. The solution to this turns out to be based on exactly the same route inspection problem :)
EDIT
I just spotted from the comments that in your particular case your graph may be a tree.
If so, then I believe the answer is much simpler as you just need to do a DFS over the tree making sure to visit the shallowest subtree first.
For example, suppose you have a tree with edges S->A and S->B->C. S has two subtrees, and you should visit A first because it is shallower.
The total edges visited will equal the number of edges visited in a full DFS, minus the depth of the last leaf visited, which is why to minimise the total edges you want to maximise the depth of the last leaf, and hence visit the shallowest subtree first.
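For a tree with unit-cost edges, that reasoning gives a closed form: twice the number of edges, minus the depth of the deepest leaf. A minimal sketch, assuming the tree is stored as a dict from node to list of children (the name `min_edge_cover_walk` and the example tree are made up):

```python
def min_edge_cover_walk(children, root):
    """Length of the shortest walk from root that covers every tree edge."""
    edge_count = 0
    deepest = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        deepest = max(deepest, depth)
        for child in children.get(node, []):
            edge_count += 1
            stack.append((child, depth + 1))
    # A full DFS walks every edge twice; the walk to the last leaf is not undone.
    return 2 * edge_count - deepest

# Hypothetical tree: S has children A and B, and B has child C.
tree = {"S": ["A", "B"], "B": ["C"]}
length = min_edge_cover_walk(tree, "S")
```

Here the walk S, A, S, B, C uses 4 edge traversals: 2 * 3 edges minus the depth 2 of leaf C.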
This is somewhat like the Eulerian Path. The main distinction is that there may be dead-ends and you may be able to modify the algorithm to suit your needs. Pruning dead-ends is one option or you may be able to reduce the graph into a number of connected components.
DFS will work here. However, you must have a good evaluation function to prune branches early. Otherwise you cannot solve this problem fast. You can refer to my discussion and implementation in Java here http://www.capacode.com/?p=650
Detail of my evaluation function
My first attempt: if the length of the current path plus the distance from U to G is not shorter than the minimum length found so far (stored in the minLength variable), we do not visit U next, because it cannot lead to a shorter path.
In practice, this evaluation function is not efficient, because it only prunes once most of the cities have already been visited. We need a more precise lower bound on the length needed to reach G with all cities visited.
Let s be the length of the path from S to U. Continuing from U through all remaining cities to G then gives a total length of at least s’ = s + ∑ minDistance(K), where K ranges over the unvisited cities other than U, and minDistance(K) is the minimum distance from K to an unvisited city. Basically, we assume that each unvisited city can be reached by its shortest edge. Note that those shortest edges may not form a valid path. We then skip U if s’ ≥ minLength.
With that evaluation function, my program can handle the problem with 20 cities within 1 second. I also added another optimization to improve performance further: before running the search, I use a greedy algorithm to get a good initial value for minLength. Specifically, from each city, we visit the nearest unvisited city next. The reason is that a smaller minLength lets us prune more.
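The bound s’ can be written as a small helper. This is only a sketch in Python rather than the Java of the linked implementation; the names `lower_bound` and `should_prune` are hypothetical, and it assumes minDistance(K) is taken over U and the other unvisited cities:

```python
def lower_bound(s, current, unvisited, dist):
    """s' = s + sum, over unvisited K != current, of K's cheapest usable edge."""
    candidates = set(unvisited) | {current}
    total = s
    for k in unvisited:
        if k == current:
            continue
        # Cheapest edge by which K could still be entered (assumption, see above).
        total += min(dist[k][j] for j in candidates if j != k)
    return total

def should_prune(s, current, unvisited, dist, min_length):
    return lower_bound(s, current, unvisited, dist) >= min_length

# Hypothetical symmetric distances between four cities.
dist = {
    "S": {"A": 1, "B": 4, "G": 6},
    "A": {"S": 1, "B": 2, "G": 5},
    "B": {"S": 4, "A": 2, "G": 3},
    "G": {"S": 6, "A": 5, "B": 3},
}
bound = lower_bound(1, "A", {"B", "G"}, dist)  # after travelling S -> A
```

In this example the bound happens to equal the true best length of S, A, B, G, but in general it is only a lower bound.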
Or will I need to develop an algorithm for every unique graph? The user is given a type of graph, and they are then supposed to use the interface to add nodes and edges to an initial graph. Then they submit the graph and the algorithm is supposed to confirm whether the user's graph matches the given graph.
The algorithm needs to confirm not only the neighbours of each node, but also that each node and each edge has the correct value. The initial graphs will always have a root node, which is where the algorithm can start from.
I am wondering if I can develop the logic for such an algorithm in the general sense, or will I need to actually code a unique algorithm for each unique graph. It isn't a big deal if it's the latter case, since I only have about 20 unique graphs.
Thanks. I hope I was clear.
The graph isomorphism problem might not be hard, but it is very hard to prove that it is not hard.
There are three possibilities for this problem.
1. Graph isomorphism problem is NP-hard.
2. Graph isomorphism problem has a polynomial time solution.
3. Graph isomorphism problem is neither NP-hard nor in P.
If two graphs are isomorphic, then there exists a permutation that realizes the isomorphism. Taking this permutation as a certificate, we can verify in polynomial time that the two graphs are isomorphic to each other. Thus, graph isomorphism lies in NP. However, for more than 30 years no one has been able to prove whether this problem is NP-hard or in P. Thus, this problem is intrinsically hard despite its simple description.
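Checking a proposed certificate is the easy part. A minimal sketch for simple undirected graphs given as edge lists, assuming every node appears in at least one edge and in the mapping (the function name `is_isomorphism` is made up):

```python
def is_isomorphism(edges_g, edges_h, mapping):
    """Check that mapping (node of G -> node of H) is one-to-one and preserves edges."""
    if len(set(mapping.values())) != len(mapping):
        return False  # two nodes of G map to the same node of H
    mapped = {frozenset((mapping[u], mapping[v])) for u, v in edges_g}
    target = {frozenset(edge) for edge in edges_h}
    return mapped == target

# Hypothetical example: two paths on three nodes.
g = [(1, 2), (2, 3)]
h = [("a", "b"), ("b", "c")]
good = is_isomorphism(g, h, {1: "a", 2: "b", 3: "c"})
bad = is_isomorphism(g, h, {1: "b", 2: "a", 3: "c"})
```

The check is linear in the number of edges; the hard part of the problem is finding the mapping, not verifying it.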
If I understand the question properly, you can have ONE single algorithm, which will work by accepting one of several reference graphs as its input (in addition to the unknown graph whose isomorphism with the reference graph is to be asserted).
It appears that you seek to assert whether a given graph is exactly identical to another graph, rather than asserting whether the graphs are isomorphic relative to a particular set of operations or characteristics. This implies that the algorithm be supplied some specific reference graph, rather than working off some set of "abstract" rules, such as whether neither graph has loops or both graphs are fully connected, even though the graphs may differ in some other fashion.
Edit, following confirmation that:
Yeah, the algorithm would be supplied a reference graph (which is the answer), and will then check the user's graph to see if it is isomorphic (including the values of edges and nodes) to the reference
In that case, yes, it is quite possible to develop a relatively simple algorithm which would assert isomorphism of these two graphs. Note that the considerations mentioned in other remarks and answers, relative to the fact that the problem may be NP-hard, merely indicate that a simple algorithm [or any algorithm for that matter] may not be sufficient to solve the problem in a reasonable amount of time for graphs whose size and complexity are too big. However, assuming relatively small graphs and taking advantage (!) of the requirement that the weights of edges and nodes also need to match, the following algorithm should generally be applicable.
General idea:
For each sub-graph that is disconnected from the rest of the graph, identify one (or possibly several) node(s) in the user graph which must match a particular node of the reference graph. By following the paths from this node [in an orderly fashion, more on this below], assert the identity of other nodes and/or determine that there are some nodes which cannot be matched (and hence that the two structures are not isomorphic).
Rough pseudo code:
1. For both the reference and the user-supplied graph, make the list of their Connected Components, i.e. the list of sub-graphs therein which are disconnected from the rest of the graph. Finding these connected components is done by following either a breadth-first or a depth-first path starting at a given node and "marking" all nodes on that path with an arbitrary [typically incremental] element ID number. Once a given path has been fully visited, repeat the operation from any other non-marked node, and do so until there are no more non-marked nodes.
2. Build a "database" of the characteristics of each graph.
This will be useful to identify matching candidates and also to determine, early on, instances of non-isomorphism.
Each "database" would have two kinds of "records" : node and edge, with the following fields, respectively:
- node: node_id, Connected_element_Id, node weight, number of outgoing edges, number of incoming edges, sum of outgoing edges weights, sum of incoming edges weights.
- edge: edge_id, Connected_element_Id, edge weight, node_id_of_start, node_id_of_end, weight_of_start_node, weight_of_end_node
3. Build a database of the Connected elements of each graph
Each record should have the following fields: Connected_element_id, number of nodes, number of edges, sum of node weights, sum of edge weights.
4. [optionally] Dispatch the easy cases of non-isomorphism:
4.a mismatch of the number of connected elements
4.b mismatch of the number of connected elements, grouped by all fields but the id (number of nodes, number of edges, sum of nodes weights, sum of edges weights)
5. For each connected element in the reference graph
5.1 Identify candidates for the matching connected element in the user-supplied graph. The candidates must have the same connected element characteristics (number of nodes, number of edges, sum of nodes weights, sum of edges weights) and contain the same list of nodes and edges, again, counted by grouping by all characteristics but the id.
5.2 For each candidate, finalize its confirmation as an isomorphic graph relative to the corresponding connected element in the reference graph. This is done by starting at a candidate node-match, i.e. a node, hopefully unique, which has the exact same characteristics on both graphs. If there is no such node, one needs to try each possible candidate in turn, disqualifying it, until isomorphism can be confirmed (or all candidates are exhausted). From the candidate node match, walk the graph, say breadth-first, finding matches for the other nodes on the basis of the direction and weight of the edges and the weight of the nodes.
The main tricks with this algorithm are to keep proper accounting of the candidates (whether a candidate connected element at the higher level or a candidate node at the lower level), and to also remember and mark other identified items as such (and un-mark them if the hypothetical candidate eventually proves not to be feasible).
I realize the above falls short of a formal algorithm description, but it should give you an idea of what is required and possibly a starting point, should you decide to implement it.
You may notice that the requirement of matching node and edge weights, which may appear to be an added difficulty for asserting isomorphism, effectively simplifies the algorithm, because the underlying node/edge characteristics make these more distinctive and hence make it more likely that the algorithm will a) find unique node candidates and b) either quickly find other candidates on the path and/or quickly assert non-isomorphism.
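As a partial illustration of the node "records" from step 2 and the early rejection from step 4, the sketch below compares the multiset of node characteristics of two graphs while ignoring the ids. It is only a necessary condition, not the full matching of step 5, and the function names and graph encoding (a node-weight dict plus a list of directed weighted edges) are assumptions:

```python
from collections import Counter

def node_signatures(node_weight, edges):
    """Multiset of (weight, out-degree, in-degree, out-weight sum, in-weight sum)."""
    out_deg, in_deg = Counter(), Counter()
    out_sum, in_sum = Counter(), Counter()
    for start, end, weight in edges:
        out_deg[start] += 1
        out_sum[start] += weight
        in_deg[end] += 1
        in_sum[end] += weight
    return Counter(
        (node_weight[n], out_deg[n], in_deg[n], out_sum[n], in_sum[n])
        for n in node_weight
    )

def may_be_isomorphic(ref, user):
    # False means "definitely different"; True still needs the full node matching.
    return node_signatures(*ref) == node_signatures(*user)

# Hypothetical graphs: (node weights, [(start, end, edge weight), ...]).
ref = ({"r": 1, "x": 2}, [("r", "x", 5)])
user_ok = ({"n1": 2, "n0": 1}, [("n0", "n1", 5)])
user_bad = ({"n0": 1, "n1": 2}, [("n0", "n1", 7)])
```

Nodes whose signature is unique in both graphs are natural starting points for the walk in step 5.2.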
I want to calculate the most profitable route and I think this is a type of traveling salesman problem.
I have a set of nodes that I can visit and a function to calculate cost for traveling between nodes and points for reaching the nodes. The goal is to reach a fixed known score while minimizing the cost.
This cost and rewards are not fixed and depend on the nodes visited before.
The starting node is fixed.
There are some restrictions on how nodes can be visited. Some simplified examples include:
Node B can only be visited after A
After node C has been visited, D or E can be visited. Visiting at least one is required, visiting both is permissible.
Z can only be visited after at least 5 other nodes have been visited
Once 50 nodes have been visited, the nodes A-M will no longer reward points
Certain nodes can (and probably must) be visited multiple times
Currently I can think of only two ways to solve this:
a) Genetic Algorithms, with the fitness function calculating the cost/benefit of the generated route
b) Dijkstra search through the graph, since the starting node is fixed, although the large number of nodes will probably make that not feasible memory wise.
Are there any other ways to determine the best route through the graph? It doesn't need to be perfect; an approximated path is perfectly fine, as long as its error is acceptable.
Would TSP-solvers be an option here?
With this much weird variation and path-dependence, what you're actually searching is not the graph itself, but the space of paths from the root, which is a tree. If the problem is as general as you say, you're not going to be able to do better than directly searching the "tree-of-paths", saving the best value and the corresponding path. If you can transform it into any way so that there is no such path-dependence, you should probably do so.
If you can't, there are two basic options: breadth-first, which will return the paths in order of length, but at the cost of high memory usage, as there are many temporary paths that must be stored. Depth-first search only needs to store a single path (which can be done entirely as a series of recursive calls), but has no natural stopping point, and is not guaranteed to actually terminate if there is no upper bound on the path size.
If you're lucky enough that the cost increases monotonically with each additional step, you can instead order by cost. The first one that's good enough is the one you then want. Breadth-first search is sometimes implemented by putting the paths to explore on a queue. Change this to a priority queue based on the cost, and you now have a "cost first search", known formally as uniform-cost search.
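A bare-bones sketch of that cost-first search over the tree of paths follows. The search state is the whole path, so the `successors` and `is_goal` callbacks (hypothetical names) are free to depend on everything visited so far; the example rules, points, and costs are invented purely for illustration:

```python
import heapq
from itertools import count

def uniform_cost_search(start, successors, is_goal):
    """Expand partial paths cheapest-first; step costs must be non-negative."""
    tie = count()  # tie-breaker so the heap never has to compare paths
    heap = [(0, next(tie), (start,))]
    while heap:
        cost, _, path = heapq.heappop(heap)
        if is_goal(path):
            return cost, path
        for node, step_cost in successors(path):
            heapq.heappush(heap, (cost + step_cost, next(tie), path + (node,)))
    return None

# Hypothetical rules: B may only be visited after A, and no node is revisited.
points = {"S": 0, "A": 2, "B": 5, "C": 1}
step_cost = {"A": 2, "B": 1, "C": 1}

def successors(path):
    for node in step_cost:
        if node in path or (node == "B" and "A" not in path):
            continue
        yield node, step_cost[node]

def is_goal(path):
    return sum(points[n] for n in path) >= 6

result = uniform_cost_search("S", successors, is_goal)
```

Because paths come off the queue in order of cost, the first path that reaches the target score is a cheapest one, provided the cost never decreases along a path.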
If the cost function can decrease by adding on the path, A* search can be modified to do the search, but you no longer have the guarantee that you can stop early.
There is a custom implementation of KSPA which needs to be rewritten. The current implementation uses a modified Dijkstra's algorithm whose pseudocode is roughly explained below. I think it is commonly known as KSPA using an edge-deletion strategy. (I am a novice in graph theory.)
Step:-1. Calculate the shortest path between any given pair of nodes using the Dijkstra algorithm. k = 0 here.
Step:-2. Set k = 1
Step:-3. Extract all the edges from all the ‘k-1’ shortest path trees. Add the same to a linked list Edge_List.
Step:-4. Create a combination of ‘k’ edges from Edge_List to be deleted at once such that each edge belongs to a different SPT (Shortest Path Tree). This can be done by inspecting the ‘k’ value for each edge of the combination considered. The ‘k’ value has to be different for each of the edge of the chosen combination.
Step:-5. Delete the combination of edges chosen in the above step temporarily from the graph in memory.
Step:-6. Re-run Dijkstra for the same pair of nodes as in Step:-1.
Step:-7. Add the resulting path into a temporary list of paths. Paths_List.
Step:-8. Restore the deleted edges back into the graph.
Step:-9. Go to Step:-4 to get another combination of edges for deletion until all unique combinations are exhausted. This is nothing but choosing ‘r’ edges at a time among ‘n’ edges => nCr.
Step:-10. The ‘k+1’ th shortest path is = Minimum(Paths_List).
Step:-11. k = k + 1. Go to Step:-3 while k < N.
Step:-12. STOP
As I understand the algorithm, to get the kth shortest path, ‘k-1’ SPTs are to be found between each source-destination pair, and ‘k-1’ edges, one from each SPT, are to be deleted simultaneously for every combination.
Clearly this algorithm has combinatorial complexity and clogs the server on large graphs. People suggested Eppstein's algorithm to me (http://www.ics.uci.edu/~eppstein/pubs/Epp-SJC-98.pdf). But this paper refers to a 'digraph', and I did not see a mention that it works only for digraphs. I just wanted to ask folks here if anyone has used this algorithm on an undirected graph?
If not, are there good algorithms (in terms of time-complexity) to implement KSPA on an undirected graph?
Thanks in advance,
Time complexity: O(K*(E*log(K)+V*log(V)))
Memory complexity of O(K*V) (+O(E) for storing the input).
We perform a modified Dijkstra as follows:
For each node, instead of keeping the best currently known cost of a route from the start node, we keep the best K routes from the start node.
When updating a node's neighbours, we don't check whether it improves the best currently known path (as Dijkstra does); we check whether it improves the worst of the K' best currently known paths.
After we have already processed the first of a node's K best routes, we no longer need to find K best routes; only K-1 remain, and after another one, K-2. That's what I call K'.
For each node we will keep two priority queues for the K' best currently known path-lengths.
In one priority queue the shortest path is on top. We use this priority queue to determine which of the K' is best and will be used in the regular Dijkstra priority queue as the node's representative.
In the other priority queue the longest path is on top. We use this one to compare candidate paths to the worst of the K' paths.
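For comparison, here is a simpler variant of the same idea, not the two-queue bookkeeping described above: a single global priority queue in which every node may be settled up to K times. It returns the lengths of the K shortest walks, which may repeat nodes, so on an undirected graph (each edge listed in both directions) back-and-forth walks are counted too. The function name and the example graph are assumptions:

```python
import heapq

def k_shortest_lengths(graph, source, target, k):
    """Lengths of the k shortest walks from source to target (nodes may repeat)."""
    pops = {node: 0 for node in graph}
    found = []
    heap = [(0, source)]
    while heap and len(found) < k:
        d, node = heapq.heappop(heap)
        if pops[node] >= k:
            continue  # this node already has its k best lengths
        pops[node] += 1
        if node == target:
            found.append(d)
        for nxt, w in graph[node]:
            if pops[nxt] < k:
                heapq.heappush(heap, (d + w, nxt))
    return found

# Hypothetical directed example with non-negative weights.
graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 1), ("D", 5)],
    "C": [("D", 1)],
    "D": [],
}
lengths = k_shortest_lengths(graph, "A", "D", 3)
```

In the example the three walks are A, B, C, D (3), A, C, D (5) and A, B, D (6). Recovering the actual routes would need a predecessor record per settled entry, which is left out here.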