simple path in Graph with special nodes with max amount of special nodes - algorithm

I have this problem:
Given a graph G = (V,E), that has a subset of the nodes called, R, that are "special" nodes. The amount of special nodes can very from case to case. The graph can be directed, undirected, does not have weights, and can contain cycles.
Now, I need a algorithm that can find a path from a node s to a node t, that passes through a maximum amount of the "special" nodes in R.
Im aware that this problem is np-hard, and is easily reducible from hamiltionian path, but I i've been looking for different ways of solving it without having to bruteforce all paths.
First attempt
First I tried doing some preprocessing of the graph, where every edge that goes to a "normal" node, gets a weight of 2, and every edge to a node in R gets a weight of 0.
Then I would just run dijkstra on the graph.
A counterexample this could however look like this:
In this graph, dijkstra would pick path [s,4,t] even though path [s,1,2,3,t] is a actual simple path with the maximum amount of red nodes
Second Attempt
My second attempt was a bit more convoluted. In this attempt I would run a bfs from the s-node and each R-node in the graph. I would then createa a new reachabilitygraph that could model which R-nodes that are connected to each other.
This approach would run into major issues in any graph that has cycles or is not directed, as connections between R-nodes that did not exist in the original graph would be included in the new graph.
So if anyone has any bids on any smart preprocessing steps that I could take, I would be cery happy

Your first method seems good, for example:
weigh of all edges to some node v in R = 1
weigh of rest of edges = 0
Then run Dijkstra with a cutoff = max special nodes

Related

Shortest path passing though some defined nodes

In a directed graph, find the shortest path from s to t such that the path passes through a certain subset of V, let's call them death nodes. The algorithm is given a number n, while traversing from s to t, the path cannot pass though more than n death nodes. What is the best way to find the shortest path, her? I am thiniing Dijkstra's, but how to make sure we are not passing though more than n nodes? Please help me tweak Dijkstra's to include this condition.
Small n
If n is small you can make n copies of your graph, call them levels 1 to n.
You start at s in level 1. If you are at a normal node, the edges take you to nodes within the same level. If you are at a death node, the edges take you to nodes within the next level. If you are at a death node on level n, the edges are simply omitted.
Also connect the t nodes at all levels to a new single destination T (with zero weight).
Then compute the shortest path from s to T.
The problem with this approach is that the graph size goes up by a factor of n, so it is only appropriate for small n.
Large n
An alternative approach is to increase the weight for each edge leaving a death node by a variable x.
As you increase the variable x, the shortest path will use fewer and fewer death nodes. Adjust the value for x (e.g. with bisection) until the graph only uses n death nodes.
This should take around O(logn) evaluations of the shortest path.
I'd add the number of dead nodes encountered on the way as a new (sparse) dimension to the computed distance -- basically you'd have up to n best distances per node.
Implementing your own BFS would be similar: You'll need to treat "seen with x dead nodes" different from "seen with y dead nodes" for each node, unless the total distance and number of dead nodes on the way are both smaller.
p.s.: If you get stuck with this approach, please post code so far O:)

Dijkstra with Parallel edges and self-loop

If I have a weighted undirected Graph with no negative weights, but can contain multiple edges between vertex and self-loops, Can I run Dijkstra algorithm without problem to find the minimum path between a source and a destination or exists a counterexample?
My guess is that there is not problem, but I want to be sure.
If you're going to run Dijkstra's algorithm without making any changes to he graph, there's a chance that you'll not get the shortest path between source and destination.
For example, consider S and O. Now, finding the shortest path really depends on which edge is being being traversed when you want to push O to the queue. If your code picks edge with weight 1, you're fine. But if your code picks the edge with weight 8, then your algorithm is going to give you the wrong answer.
This means that the algorithm's correctness is now dependent on the order of edges entered in the adjacency list of the source node.
You can trivially transform your graph to one without single-edge loops and parallel edges.
With single-edge loops you need to check whether their weight is negative or non-negative. If the weight is negative, there obviously is no shortest path, as you can keep spinning in place and reduce your path length beyond any limit. If however the weight is positive, you can throw that edge away, as no shortest path can go through that edge.
A zero-weight edge would create a similar problem than any zero-weight loop: there will be not one but an infinite number of shortest paths, going through the same loop over and over again. In these cases the sensible thing is again to remove the edge from the graph.
Out of the parallel edges you can throw away all but the one with the lowest weight. The reasoning for this is equally simple: if there was a shortest path going through an edge A that has a parallel edge B with lower weight, you could construct an even shorter path by simply replacing A with B. Therefore no shortest path can go through A.
It just needs a minor variation. If there are multiple edges directed from u to v and each edge has a different weight, you can either:
Pick the weight with least edge for relaxation; or
Run relaxation for each edge.
Both of the above will have the same complexity although the constant factors in #2 will have higher values.
In any case you'll need to make sure that you evaluate all edges between u and v before moving to the next adjacent node of u.
I don't think it will create any kind of problem.As the dijkstra algorithm will use priority queue ,so offcourse minimum value will get update first.

Tricky algorithm for finding alternative route in graph with few added edges

Okay, so I found this a bit tricky.
Basically, you have a directed graph (let's call it the base graph), that has some leaves and a node with 0 indegree that is called root. It may contain cycles.
From that base graph, a tree has been made, that contains the root and all leaves, and some connection between them. The nodes and edges that are not needed to connect the root to the leaves are left out.
Now imagine one or more edges in the tree "break", and can no longer be used. The problem now is to
a) If possible, find an alternative route(s) to the disconnected node(s), introducing as few previously unused edges from the base graph as possible.
b) If not possible, select which edges to "repair", repairing as few edges as possible to get all leaves connected again.
This is supposed to represent an electrical grid, and the breaks are power outages.
If just one edge is broken, it is easy enough. But say you have a graph with 100 leaves, 500 edges, and 50 edges break. Now to find which combination of adding previously unused edges from the base graph, and if necessary repairing some edges, to connect all leaves, seems like a very hard problem.
I imagined one could do some sort of brute force, where ALL combinations of unused edges, from using 1 to all of them, are tested. Or if repairs are needed, testing ALL combinations of repairs with all combinations of new edges. When the amount of edges get high, this seems to me very very inefficient.
My question is, does anyone have any smart ideas to how this could be done in a more efficient way? I hope I explained it well enough.
This is an NP-hard problem, and I'll explain why. Imagine that you have three layers of nodes: the root node, a layer of intermediate connecting nodes, and then a layer of leaf nodes. Edges go from root to intermediate nodes, and from an intermediate node to some subset of leaf nodes. Suppose you have some choice of intermediate nodes and edges to leaf nodes that gives you a connected tree graph, where each intermediate node has an edge to only one leaf node. Now imagine all edges in the reduced graph are removed. Then to find the minimum number of edges needed to add to repair the graph, this is equivalent to finding the minimum number of remaining intermediate nodes whose edges cover all of the leaf nodes. This is equivalent to the set cover problem for the leaf nodes http://en.wikipedia.org/wiki/Set_cover_problem and is NP-hard. Thus there is almost certainly no fast algorithm for your problem in the worst case (unless P = NP). Maybe if you bound the number of edges that are removed, you can come up with a polynomial time algorithm where the exponent in the polynomial depends (hopefully weakly) on how many edges were removed.
Seems like the start to a good efficient heuristic/solution would be to weight the edges. A couple simple approaches (not the most space efficient) as to how you could weight the edges based on the total number of edges are listed below.
If using any number of undamaged edges is better than using a single alternative edge and using any number of alternative edges is better than a single damaged edge.
Undamaged edge: 1
Alternative edge: E
Damaged edge: E^2
In the case of 100 vertices and 500 edges, alternative edges would be weighted as 500, while damaged edges would be weighted as 250000.
If using any number of undamaged edges is better than using a single alternative edge or a single damaged edge.
Undamaged edge: 1
Alternative/damaged edge: E
In the case of 100 vertices and 500 edges, alternative/damaged edges would be weighted as 500.
It seems like you then try a number of approaches to find either the exact solution or a heuristic result. The main suggestion I have for an algorithm is below.
Find the directed minimium spanning tree. If you use the weighting listed above, then I believe the result is optimum if I'm understanding things correctly.
Although, if you have intermediate nodes (nodes that are neither the root or a leaf), then this is likely to result in an overestimating heuristic. In which case, you might be able to get around it by running all pairs all shortest paths first and use the path costs for that as input for the directed minimium spanning tree algorithm, but that's probably a heuristic as well.

Finding a path in which the difference between min edge cost and max edge cost is minimum across all routes

We have to find the route from a source to a sink in a graph in which the difference between the maximum costing edge and the minimum costing edge is minimum.
I tried using a recursive solution but it would fail in condition of cycles and a modified dijkstra which also failed.
Is there an algorithm where i will not have to find all routes and then find the minimum?
Sort the edges by weight (in non-decreasing order), then for each edge do the next: Add the next edges (the ones with greater or equal weight in non-decreasing order) until the source and sink are connected, then update your answer with de difference between the last edge, like this:
ans = INFINITE
for each Edge e1 in E (sorted by weight in non-decreasing order)
clear the temporal graph
for each Edge e2 in E
if e2.weight >= e1.weight
add e2 to the temporal graph
if sink and source are connected in the temporal graph
ans = min(ans, e2.weight - e1.weight)
break
print ans
If you use the UNION-FIND structure to add edges and check connectivity between source and sink you should get an overall time of O(edges^2)
OK, so you have a graph, and you need to find a path from one node (source) to another (sink) such that max edge weight on the path minus min edge weight on the path is minimized. You don't say whether your graph is directed, can have negative edge weights, or has cycles, so lets assume a "yes" answer to all these questions.
When computing your path "score" (maximum difference between edge weights), we observe these are similar to path distances: you could have a path from A to B that scores higher (undesirable) or lower (desirable). If we treat path scores like path weights we observe that as we build a path by adding new edges, the path score (=weight) can only increase: given a path A->B->C where weight(A->B)=1 and weight(B->C)=5, yielding a path score of 4, if I add edge C->D to the path A->B->C, the path score can only increase or stay the same: the difference between minimum edge weight and maximum edge weight will not be lower than 4.
The conclusion from all of this is that we can explore the graph looking for best paths as if we are looking for optimal path in a graph with no negative edges. However, there could be (and likely to be, given the connectivity described) cycles. This means that Dijkstra's algorithm, properly implemented, will have optimal performance relative to this graph's topology given what we know today.
No solution without full graph exploration
One may be misled into thinking that we can make locally good decisions about which edge should belong to the optimal path without exploring the whole graph. The following subgraph illustrates the problem with this:
Say you've need a path from A to F, and you are at node B. Given everything you know, you'd choose a path to C, as it minimizes the path score. What you don't know (yet) is that the next edge in that path will cause the score of this path to increase substantially. If you knew that you would have chosen the edge B->D as a next element in an optimal path.

How to detect if the given graph has a cycle containing all of its nodes? Does the suggested algorithm have any flaws?

I have a connected, non-directed, graph with N nodes and 2N-3 edges. You can consider the graph as it is built onto an existing initial graph, which has 3 nodes and 3 edges. Every node added onto the graph and has 2 connections with the existing nodes in the graph. When all nodes are added to the graph (N-3 nodes added in total), the final graph is constructed.
Originally I'm asked, what is the maximum number of nodes in this graph that can be visited exactly once (except for the initial node), i.e., what is the maximum number of nodes contained in the largest Hamiltonian path of the given graph? (Okay, saying largest Hamiltonian path is not a valid phrase, but considering the question's nature, I need to find a max. number of nodes that are visited once and the trip ends at the initial node. I thought it can be considered as a sub-graph which is Hamiltonian, and consists max. number of nodes, thus largest possible Hamiltonian path).
Since i'm not asked to find a path, I should check if a hamiltonian path exists for given number of nodes first. I know that planar graphs and cycle graphs (Cn) are hamiltonian graphs (I also know Ore's theorem for Hamiltonian graphs, but the graph I will be working on will not be a dense graph with a great probability, thus making Ore's theorem pretty much useless in my case). Therefore I need to find an algorithm for checking if the graph is cycle graph, i.e. does there exist a cycle which contains all the nodes of the given graph.
Since DFS is used for detecting cycles, I thought some minor manipulation to the DFS can help me detect what I am looking for, as in keeping track of explored nodes, and finally checking if the last node visited has a connection to the initial node. Unfortunately
I could not succeed with that approach.
Another approach I tried was excluding a node, and then try to reach to its adjacent node starting from its other adjacent node. That algorithm may not give correct results according to the chosen adjacent nodes.
I'm pretty much stuck here. Can you help me think of another algorithm to tell me if the graph is a cycle graph?
Edit
I realized by the help of the comment (thank you for it n.m.):
A cycle graph consists of a single cycle and has N edges and N vertices. If there exist a cycle which contains all the nodes of the given graph, that's a Hamiltonian cycle. – n.m.
that I am actually searching for a Hamiltonian path, which I did not intend to do so:)
On a second thought, I think checking the Hamiltonian property of the graph while building it will be more efficient, which is I'm also looking for: time efficiency.
After some thinking, I thought whatever the number of nodes will be, the graph seems to be Hamiltonian due to node addition criteria. The problem is I can't be sure and I can't prove it. Does adding nodes in that fashion, i.e. adding new nodes with 2 edges which connect the added node to the existing nodes, alter the Hamiltonian property of the graph? If it doesn't alter the Hamiltonian property, how so? If it does alter, again, how so? Thanks.
EDIT #2
I, again, realized that building the graph the way I described might alter the Hamiltonian property. Consider an input given as follows:
1 3
2 3
1 5
1 3
these input says that 4th node is connected to node 1 and node 3, 5th to node 2 and node 3 . . .
4th and 7th node are connected to the same nodes, thus lowering the maximum number of nodes that can be visited exactly once, by 1. If i detect these collisions (NOT including an input such as 3 3, which is an example that you suggested since the problem states that the newly added edges are connected to 2 other nodes) and lower the maximum number of nodes, starting from N, I believe I can get the right result.
See, I do not choose the connections, they are given to me and I have to find the max. number of nodes.
I think counting the same connections while building the graph and subtracting the number of same connections from N will give the right result? Can you confirm this or is there a flaw with this algorithm?
What we have in this problem is a connected, non-directed graph with N nodes and 2N-3 edges. Consider the graph given below,
A
/ \
B _ C
( )
D
The Graph does not have a Hamiltonian Cycle. But the Graph is constructed conforming to your rules of adding nodes. So searching for a Hamiltonian Cycle may not give you the solution. More over even if it is possible Hamiltonian Cycle detection is an NP-Complete problem with O(2N) complexity. So the approach may not be ideal.
What I suggest is to use a modified version of Floyd's Cycle Finding algorithm (Also called the Tortoise and Hare Algorithm).
The modified algorithm is,
1. Initialize a List CYC_LIST to ∅.
2. Add the root node to the list CYC_LIST and set it as unvisited.
3. Call the function Floyd() twice with the unvisited node in the list CYC_LIST for each of the two edges. Mark the node as visited.
4. Add all the previously unvisited vertices traversed by the Tortoise pointer to the list CYC_LIST.
5. Repeat steps 3 and 4 until no more unvisited nodes remains in the list.
6. If the list CYC_LIST contains N nodes, then the Graph contains a Cycle involving all the nodes.
The algorithm calls Floyd's Cycle Finding Algorithm a maximum of 2N times. Floyd's Cycle Finding algorithm takes a linear time ( O(N) ). So the complexity of the modied algorithm is O(N2) which is much better than the exponential time taken by the Hamiltonian Cycle based approach.
One possible problem with this approach is that it will detect closed paths along with cycles unless stricter checking criteria are implemented.
Reply to Edit #2
Consider the Graph given below,
A------------\
/ \ \
B _ C \
|\ /| \
| D | F
\ / /
\ / /
E------------/
According to your algorithm this graph does not have a cycle containing all the nodes.
But there is a cycle in this graph containing all the nodes.
A-B-D-C-E-F-A
So I think there is some flaw with your approach. But suppose if your algorithm is correct, it is far better than my approach. Since mine takes O(n2) time, where as yours takes just O(n).
To add some clarification to this thread: finding a Hamiltonian Cycle is NP-complete, which implies that finding a longest cycle is also NP-complete because if we can find a longest cycle in any graph, we can find the Hamiltonian cycle of the subgraph induced by the vertices that lie on that cycle. (See also for example this paper regarding the longest cycle problem)
We can't use Dirac's criterion here: Dirac only tells us minimum degree >= n/2 -> Hamiltonian Cycle, that is the implication in the opposite direction of what we would need. The other way around is definitely wrong: take a cycle over n vertices, every vertex in it has exactly degree 2, no matter the size of the circle, but it has (is) an HC. What you can tell from Dirac is that no Hamiltonian Cycle -> minimum degree < n/2, which is of no use here since we don't know whether our graph has an HC or not, so we can't use the implication (nevertheless every graph we construct according to what OP described will have a vertex of degree 2, namely the last vertex added to the graph, so for arbitrary n, we have minimum degree 2).
The problem is that you can construct both graphs of arbitrary size that have an HC and graphs of arbitrary size that do not have an HC. For the first part: if the original triangle is A,B,C and the vertices added are numbered 1 to k, then connect the 1st added vertex to A and C and the k+1-th vertex to A and the k-th vertex for all k >= 1. The cycle is A,B,C,1,2,...,k,A. For the second part, connect both vertices 1 and 2 to A and B; that graph does not have an HC.
What is also important to note is that the property of having an HC can change from one vertex to the other during construction. You can both create and destroy the HC property when you add a vertex, so you would have to check for it every time you add a vertex. A simple example: take the graph after the 1st vertex was added, and add a second vertex along with edges to the same two vertices of the triangle that the 1st vertex was connected to. This constructs from a graph with an HC a graph without an HC. The other way around: add now a 3rd vertex and connect it to 1 and 2; this builds from a graph without an HC a graph with an HC.
Storing the last known HC during construction doesn't really help you because it may change completely. You could have an HC after the 20th vertex was added, then not have one for k in [21,2000], and have one again for the 2001st vertex added. Most likely the HC you had on 23 vertices will not help you a lot.
If you want to figure out how to solve this problem efficiently, you'll have to find criteria that work for all your graphs that can be checked for efficiently. Otherwise, your problem doesn't appear to me to be simpler than the Hamiltonian Cycle problem is in the general case, so you might be able to adjust one of the algorithms used for that problem to your variant of it.
Below I have added three extra nodes (3,4,5) in the original graph and it does seem like I can keep adding new nodes indefinitely while keeping the property of Hamiltonian cycle. For the below graph the cycle would be 0-1-3-5-4-2-0
1---3---5
/ \ / \ /
0---2---4
As there were no extra restrictions about how you can add a new node with two edges, I think by construction you can have a graph that holds the property of hamiltonian cycle.

Resources