Dijkstra's shortest path algorithm - algorithm

Dijkstra's algorithm in CLRS, p.595 has the following code in line 7:
for each vertex v \in Adj[u]
This line iterates over all neighbors of node u. Node u here is the one the algorithm is currently processing and adding to the shortest-path tree.
However, among those neighbors of u, those already in set S have been processed and are done with, and the
nodes in S form a shortest-path tree T.
None of the nodes in set S can have a path through u that is shorter than the path already in T.
Otherwise, that path would have been found by then.
So, shouldn't this line 7 be better as
for each vertex v \in Adj[u] \intersect Q // Q = V \ S
or, equally,
for each vertex v \in Adj[u]\S
?
//===========================
ADDING explanations:
Once you have processed a node u (processed = set the distance and parent vector
entries of all its immediate neighbors) and added it to the tree,
that node u is at its shortest distance from the source. If there were an off-tree node
z such that a shorter path from the source to u existed through it, that node z would have been processed before u.
//======================
ADDITION 2: lengthy comment to Javier's useful answer below:
Put all edges in the graph in an array say "EDGES"-- one edge, one array entry.
each array entry holds the edge (u,v), the edge-weight, and 2 pointers-- one to node u and one to node v.
the graph is represented still as an adjacency list.
Adj[u] is still a linked list-- however, this linked list is on an array structure.
The node values in this list are, this time, the indices into EDGES of the corresponding edges.
So, for instance, Node u has 2 links incident to it:
(u,x) & (u,y). Edge (u,x) is sitting at the 23rd cell of EDGES and (u,y) at 5th.
Then Adj[u] is a linked list of length 2 whose nodes are 23 and 5. Say Adj[u][0].edgesIndex=23 and Adj[u][1].edgesIndex=5. Likewise, Adj[x][i].edgesIndex=23 for some i in the linked list at Adj[x]. (Adj[j][i], being a node in a linked list, also has the "next" and "prev" fields.)
And, EDGES[23] has one reference to the corresponding entry of (u,x) on Adj[u], and another to that of Adj[x]. I leave line 7 as is, but this time, after i process an edge (u,v) in that loop, (i've found out about this edge from Adj[u]), i remove that edge from the linked list of Adj[u], from there i go to the corresponding EDGES entry, which has the reference to the corresponding Adj[x][i] entry. i remove them all-- EDGES[23], Adj[u][0] and Adj[x][i] (whatever i is there.) With all arrays-structures, i can process all these in constant time for each edge.
This is still the adjacency-list representation, but it can locate (v,u) from (u,v) and remove both in constant time. The loop then processes only the edges in the intersection I'm looking for, using asymptotically the same amount of memory and with better time efficiency.
//====================
ADDITION 3:
Correcting one thing in ADDITION 2 above:
What I wrote in that addition may take more, not less, time than the algorithm without it:
removing the links in the linked lists at Adj[u] and Adj[x] and the corresponding
EDGES entry involves direct-memory look-ups that are unlikely
to take fewer CPU cycles than simply relaxing the edges as the algorithm does now.
It still checks every edge (u,v) exactly once rather than twice
(once for (u,v) and once for (v,u)), and clearly in the same asymptotic time as the algorithm without it, but with little gain in absolute
processing time and at a higher memory cost.
Another alternative is:
adding a line of something like
if (v \in S) then continue;
as the first statement of the for loop. This can be implemented by maintaining S as
a boolean array of size |V| and setting its entries as each vertex is
added to set S, which is basically what Javier is saying in his answer.

Intersecting Adj[u] with Q is correct; however, it's not a good idea, because in the end you still need to iterate over all elements of Adj[u]. I don't think there's a way to work around that.
It would only pay off if you could find a way to intersect those two sets VERY efficiently, i.e., in anything better than O(n).
A nice enhancement you can implement is to mark all the nodes that are settled; then, if node v is settled, you can skip the rest of the inner loop body for it.
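A minimal Python sketch of that enhancement follows. It is not the CLRS pseudocode: it assumes a binary heap with lazy deletion instead of a decrease-key priority queue, and the names `dijkstra`, `adj` and `settled` are chosen here for illustration. The `settled` set plays the role of S, and the `continue` inside the inner loop is the "skip settled neighbors" check discussed above.

```python
import heapq

def dijkstra(adj, source):
    """adj maps a node to a list of (neighbor, weight) pairs; weights are non-negative."""
    dist = {source: 0}
    settled = set()                 # the set S of finished nodes
    heap = [(0, source)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if u in settled:            # stale heap entry (lazy deletion)
            continue
        settled.add(u)
        for v, w in adj.get(u, ()):
            if v in settled:        # v is already in S: nothing to relax
                continue
            candidate = d_u + w
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist

graph = {
    "s": [("a", 1), ("b", 4)],
    "a": [("s", 1), ("b", 2)],
    "b": [("s", 4), ("a", 2)],
}
print(dijkstra(graph, "s"))
```

The check costs O(1) per edge, so the loop still touches every entry of Adj[u]; it only avoids the relaxation attempt for settled neighbors.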

Related

How to find which vertex has out-degree equal to v-1 in a simple DAG with only adjacency matrix in O(V)?

I have a simple directed graph, with no anti-parallel edges. I need to find an algorithm to determine if this graph contains a vertex with out-deg=|V|-1 and in-deg=0.
The input of this algorithm can only be an adjacency-matrix of this graph. And using this adjacency-matrix, we need to do it in O(|V|).
Thank you for your help.
Since there can be at most one such node, this can be done by iteratively eliminating candidates.
First, put all nodes onto a stack. We will use this stack to keep our candidates.
As long as we have at least two nodes p and q on the stack, check if the edge (p, q) exists. If it exists, then q cannot have in-degree 0 and we can remove it from the stack. If it does not exist, then p cannot have out-degree |V|-1, so we can remove it from the stack. Hence, after each check, we remove exactly one candidate, which allows us to arrive at a single candidate after O(|V|) checks.
Now we only need to check this node for the given in- and out-degree by checking the corresponding row and column in the adjacency matrix, which can also be done in O(|V|).
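The elimination described above can be sketched in Python as follows. The function name `find_universal_source` and the 0/1 list-of-lists matrix format are assumptions made for this example, with `matrix[p][q]` truthy when the edge (p, q) exists.

```python
def find_universal_source(matrix):
    """Return the vertex with out-degree |V|-1 and in-degree 0, or None."""
    n = len(matrix)
    if n == 0:
        return None
    candidates = list(range(n))     # used as a stack
    while len(candidates) > 1:
        q = candidates.pop()
        p = candidates.pop()
        if matrix[p][q]:
            candidates.append(p)    # q has an incoming edge, so q is eliminated
        else:
            candidates.append(q)    # p misses an outgoing edge, so p is eliminated
    c = candidates[0]
    # Verify the surviving candidate against its row and column.
    out_ok = all(matrix[c][j] for j in range(n) if j != c)
    in_ok = not any(matrix[i][c] for i in range(n))
    return c if out_ok and in_ok else None

m = [[0, 1, 1],
     [0, 0, 1],
     [0, 0, 0]]
print(find_universal_source(m))
```

The loop reads n-1 matrix entries and the verification reads at most 2n more, so the whole procedure looks at O(|V|) entries.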

Designing an Algorithm to find the length of a simple cycle in a d-regular graph

I understand the question in general but don't know how to design and analyze the algorithm in the question. I was thinking of applying some sort of graph search algorithm like depth-first / breadth-first search.
UPDATE: This is what I have tried: starting from any node of the graph (call it N), visit each of that node's d neighbors. Now, from the last neighbor of N we just visited (call it L), visit any other neighbor of L that is not N?
Others have already hinted at a possible solution in comments, so let's elaborate. When d<=1, the solutions are immediate (and depend on your exact definition of cycle), so I'll assume d>1.
One such algorithm would be:
(1) Build a path starting in any vertex V. Until the path has length d, don't allow vertices you've already visited.
(2) Once the path is d vertices long, keep adding vertices to the path, but now only allow vertices different from the last d vertices of the path.
(3) When you add a vertex that's already been used in the path, stop. You create the resulting cycle from a segment of the path starting and ending in that vertex.
In both (1) and (2), the existence of such a vertex is guaranteed by the fact that G is d-regular. When searching for the vertex to add, we only exclude the last d vertices, namely the last vertex (U) and its d-1 predecessors. U has d neighbors, so at least one of them has to be available.
The algorithm will stop, because of the condition (3) and the fact that G is finite.
It makes sense to prefer already visited vertices in (2), but it doesn't change the worst-case complexity.
This gives us a worst-case complexity of O(n*d), since we may have to visit every vertex once and check all of its edges.
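A rough Python sketch of this path-growing procedure is given below. It assumes a simple d-regular graph stored as a dict of neighbor lists; the name `find_long_cycle` is made up for this example. While the path has fewer than d vertices, "the last d vertices" is the whole path, so a single exclusion rule covers both phases.

```python
def find_long_cycle(adj, d, start):
    """Grow a path and return a cycle as a list of vertices.

    Assumes adj describes a simple d-regular graph with d > 1.
    """
    path = [start]
    position = {start: 0}           # vertex -> index in path
    while True:
        u = path[-1]
        recent = set(path[-d:])     # u and up to d-1 predecessors
        # d-regularity guarantees that some neighbor of u is outside `recent`;
        # next() raises StopIteration if the graph is not d-regular.
        nxt = next(v for v in adj[u] if v not in recent)
        if nxt in position:         # vertex seen before: the cycle is closed
            return path[position[nxt]:]
        position[nxt] = len(path)
        path.append(nxt)

k4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
print(find_long_cycle(k4, 3, 0))
```

Because the closing vertex is never among the last d vertices of the path, the returned cycle has at least d+1 vertices. Each step builds a set of at most d vertices and scans at most d neighbors, and there are at most n steps, which matches the n*d bound.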

Path from s to e in a weighted DAG graph with limitations

Consider a directed graph with n nodes and m edges. Each edge is weighted. There is a start node s and an end node e. We want to find the path from s to e that has the maximum number of nodes such that:
the total distance is less than some constant d
starting from s, each node in the path is closer than the previous one to the node e. (as in, when you traverse the path you are getting closer to your destination e. in terms of the edge weight of the remaining path.)
We can assume there are no cycles in the graph. There are no negative weights. Does an efficient algorithm already exist for this problem? Is there a name for this problem?
Whatever you end up doing, do a BFS/DFS starting from s first to see if e can even be reached; this only takes you O(n+m) so it won't add to the complexity of the problem (since you need to look at all vertices and edges anyway). Also, delete all edges with weight 0 before you do anything else since those never fulfill your second criterion.
EDIT: I figured out an algorithm; it's polynomial, depending on the size of your graphs it may still not be sufficiently efficient though. See the edit further down.
Now for some complexity. The first thing to think about here is an upper bound on how many paths we can actually have, so depending on the choice of d and the weights of the edges, we also have an upper bound on the complexity of any potential algorithm.
How many edges can there be in a DAG? The answer is n(n-1)/2, which is a tight bound: take n vertices, order them from 1 to n; for two vertices i and j, add an edge i->j to the graph iff i<j. This sums to a total of n(n-1)/2, since this way, for every pair of vertices, there is exactly one directed edge between them, meaning we have as many edges in the graph as we would have in a complete undirected graph over n vertices.
How many paths can there be from one vertex to another in the graph described above? The answer is 2^{n-2}. Proof by induction:
Take the graph over 2 vertices as described above; there is 1 = 2^0 = 2^{2-2} path from vertex 1 to vertex 2: (1->2).
Induction step: assuming there are 2^{n-2} paths from the vertex with number 1 of an n-vertex graph as described above to the vertex with number n, increment the number of each vertex and add a new vertex 1 along with the required n edges. It has its own edge to the vertex now labeled n+1. Additionally, it has 2^{n-i} paths to that vertex through every i in [2;n] (it has all the paths the other vertices have to the vertex n+1 collectively, each "prefixed" with the edge 1->i). This gives us 1 + Σ_{i=2}^{n} 2^{n-i} = 1 + Σ_{k=0}^{n-2} 2^k = 1 + (2^{n-1} - 1) = 2^{n-1} = 2^{(n+1)-2}.
So we see that there are DAGs that have 2^{n-2} distinct paths between some pairs of their vertices; this is a bit of a bleak outlook, since depending on weights and your choice of d, you may have to consider them all. This in itself doesn't mean we can't choose some form of optimum (which is what you're looking for) efficiently though.
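As a sanity check of that count, the short Python sketch below counts the paths from vertex 1 to vertex n in the complete DAG (edge i->j iff i<j) with a small dynamic program and compares the result with 2^(n-2). The function name is chosen here for illustration only.

```python
def count_paths_complete_dag(n):
    """Number of paths from vertex 1 to vertex n when i->j exists iff i < j."""
    paths = [0] * (n + 1)           # paths[i] = number of paths from i to n
    paths[n] = 1
    for i in range(n - 1, 0, -1):
        # Every successor j > i contributes all of its own paths to n.
        paths[i] = sum(paths[j] for j in range(i + 1, n + 1))
    return paths[1]

for n in range(2, 8):
    print(n, count_paths_complete_dag(n), 2 ** (n - 2))
```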
EDIT: Ok so here goes what I would do:
Delete all edges with weight 0 (and smaller, but you ruled that out), since they can never fulfill your second criterion.
Do a topological sort of the graph; in the following, let's only consider the part of the topological sorting of the graph from s to e, let's call that the integer interval [s;e]. Delete everything from the graph that isn't strictly in that interval, meaning all vertices outside of it along with the incident edges. During the topSort, you'll also be able to see whether there is a path from s to e, so you'll know whether there are any paths s-...->e. Complexity of this part is O(n+m).
Now the actual algorithm:
Traverse the vertices of [s;e] in the order imposed by the topological sorting.
For every vertex v, store a two-dimensional array of information; let's call it pre_v[][], since it's gonna store information about the predecessors of a node on the paths leading towards it.
In pre_v[i][j], store how long the total path of length i (counted in vertices) is as a sum of the edge weights, if j is the predecessor of the current vertex on that path. For example, pre_{s+1}[1][s] would have the weight of the edge s->s+1 in it, while all other entries in pre_{s+1} would be 0/undefined.
When calculating the array for a new vertex v, all we have to do is check its incoming edges and iterate over the arrays for the start vertices of those edges. For example, let's say vertex v has an incoming edge from vertex w, having weight c. Consider what the entry pre_v[i][w] should be. We have an edge w->v, so we need to set pre_v[i][w] to min(pre_w[i-1][k] for all k, but ignore entries with 0) + c (notice the subscript of the array!); we effectively take the cost of a path of length i - 1 that leads to w, and add the cost of the edge w->v. Why the minimum? The vertex w can have many predecessors for paths of length i - 1; however, we want to stay below a cost limit, which greedy minimization at each vertex will do for us. We will need to do this for all i in [1;v-s].
While calculating the array for a vertex, do not set entries that would give you a path with cost above d; since all edges have positive weights, we can only get more costly paths with each edge, so just ignore those.
Once you have reached e and finished calculating pre_e, you're done with this part of the algorithm.
Iterate over pre_e, starting with pre_e[e-s]; since we have no cycles, all paths are simple paths and therefore the longest path from s to e can have e-s edges. Find the largest i such that pre_e[i] has a non-zero (meaning it is defined) entry; if none exists, there is no path fitting your criteria. You can reconstruct any existing path using the arrays of the other vertices.
Now that gives you a space complexity of O(n^3) and a time complexity of O(n²m) - the arrays have O(n²) entries, we have to iterate over O(m) arrays, one array for each edge - but I think it's very obvious where the wasteful use of data structures here can be optimized using hashing structures and other things than arrays. Or you could just use a one-dimensional array and only store the current minimum instead of recomputing it every time (you'll have to encapsulate the sum of edge weights of the path together with the predecessor vertex though since you need to know the predecessor to reconstruct the path), which would change the size of the arrays from n² to n since you now only need one entry per number-of-nodes-on-path-to-vertex, bringing down the space complexity of the algorithm to O(n²) and the time complexity to O(nm). You can also try and do some form of topological sort that gets rid of the vertices from which you can't reach e, because those can be safely ignored as well.
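Here is a hedged Python sketch of the one-dimensional variant mentioned at the end of the paragraph above. It rests on several assumptions: vertices are numbered 0..n-1, edges come as (u, v, weight) triples, the budget is the strict bound "total weight < d", and the second criterion is handled only by dropping zero-weight edges, as the answer suggests. The function name and the dict-per-vertex layout are illustrative choices, not part of the original answer.

```python
from collections import defaultdict, deque

def longest_bounded_path(n, edges, s, e, d):
    """Return an s->e path with the most vertices and total weight < d, or None."""
    adj = defaultdict(list)
    indeg = [0] * n
    for u, v, w in edges:
        if w > 0:                               # zero-weight edges are dropped
            adj[u].append((v, w))
            indeg[v] += 1
    # Topological order (Kahn's algorithm).
    order = []
    queue = deque(v for v in range(n) if indeg[v] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v, _ in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    # best[v][i] = (minimum cost of an s->v path with i edges, predecessor on it)
    best = [dict() for _ in range(n)]
    best[s][0] = (0, None)
    for u in order:
        for i, (cost, _) in list(best[u].items()):
            for v, w in adj[u]:
                new_cost = cost + w
                if new_cost >= d:               # over budget: never store it
                    continue
                if i + 1 not in best[v] or new_cost < best[v][i + 1][0]:
                    best[v][i + 1] = (new_cost, u)
    if not best[e]:
        return None
    # Walk back from e along the stored predecessors of the longest entry.
    i, v, path = max(best[e]), e, [e]
    while i > 0:
        v = best[v][i][1]
        path.append(v)
        i -= 1
    return path[::-1]

edges = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (0, 3, 1), (0, 2, 5)]
print(longest_bounded_path(4, edges, 0, 3, 4))
```

Each vertex keeps at most n entries and every edge is examined once per entry of its start vertex, which is in line with the O(n²) space and O(nm) time estimate given above.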

Directed graph connectivity

Given a directed graph G, what is the best way to go about finding a vertex v such that there is a path from v to every other vertex in G?
This algorithm should run in linear time. Is there an existing algorithm that solves this? If not, I'd appreciate some insight into how this can be solved in linear time (I can only think of solutions that would certainly not take linear time).
Make a list L of all vertices.
Choose one; call it V. From V, walk the graph, removing points from the list as you go, and keeping a stack of unvisited edges. When you find a loop (some vertex you visit is not on the list), pop one of the edges from the stack and proceed.
If the stack is empty, and L is not empty, then choose a new vertex from L, call it V, and proceed as before.
When L is finally empty, the V you last chose is an answer.
This can be done in linear time in the number of edges.
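The walk described above can be sketched in Python like this. The adjacency-list format (a list of neighbor lists for vertices 0..n-1) and the name `find_root_vertex` are assumptions for the example. One addition that is not part of the answer: the sketch runs a final search from the last chosen vertex, because that vertex is only guaranteed to reach everything when such a vertex exists at all.

```python
def _mark_reachable(adj, start, visited):
    """Iterative DFS that marks everything reachable from start."""
    visited[start] = True
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True
                stack.append(v)

def find_root_vertex(adj):
    """Return a vertex with a path to every other vertex, or None."""
    n = len(adj)
    visited = [False] * n
    candidate = None
    for start in range(n):
        if not visited[start]:      # start is still "on the list L"
            candidate = start
            _mark_reachable(adj, start, visited)
    if candidate is None:
        return None
    # Verification pass (an addition to the answer's description).
    seen = [False] * n
    _mark_reachable(adj, candidate, seen)
    return candidate if all(seen) else None

print(find_root_vertex([[], [0], [1]]))
```

Both passes visit each vertex and edge at most once, so the total work is O(n + m).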
Find the strongly connected components.
Condense each of the components into a single node.
Do a topological sort on the condensed graph. The node with the highest rank will have a path to each of the other nodes (if the graph is connected at all).
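One way to realize the condensation idea is sketched below in Python. It assumes Kosaraju's two-pass algorithm for the strongly connected components (the answer does not name a particular SCC algorithm) and replaces the explicit topological sort with an equivalent test: a vertex reaching everything exists exactly when the condensed graph has a single component without incoming edges. The function name is illustrative.

```python
def vertex_reaching_all(adj):
    """adj is a list of neighbor lists for vertices 0..n-1."""
    n = len(adj)
    # Pass 1: record vertices in order of DFS completion.
    order, seen = [], [False] * n
    for root in range(n):
        if seen[root]:
            continue
        seen[root] = True
        stack = [(root, iter(adj[root]))]
        while stack:
            u, neighbors = stack[-1]
            for v in neighbors:
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, iter(adj[v])))
                    break
            else:                   # all neighbors of u are done
                order.append(u)
                stack.pop()
    # Pass 2: search the reversed graph in decreasing completion order.
    radj = [[] for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            radj[v].append(u)
    comp, count = [-1] * n, 0
    for root in reversed(order):
        if comp[root] != -1:
            continue
        comp[root] = count
        stack = [root]
        while stack:
            u = stack.pop()
            for v in radj[u]:
                if comp[v] == -1:
                    comp[v] = count
                    stack.append(v)
        count += 1
    # A component with no incoming edge is a source of the condensed graph.
    has_incoming = [False] * count
    for u in range(n):
        for v in adj[u]:
            if comp[u] != comp[v]:
                has_incoming[comp[v]] = True
    sources = [c for c in range(count) if not has_incoming[c]]
    if len(sources) != 1:
        return None
    return comp.index(sources[0])   # any vertex of that component works

print(vertex_reaching_all([[1], [2], [0, 3], []]))
```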
I think I've got a correct answer.
Get the SCC.
Condense each of the components into a single node.
Check whether every pair of adjacent nodes is reachable.
This is a sufficient and necessary condition.

Is there a proper algorithm to solve edge-removing problem?

There is a directed graph (not necessarily connected) of which one or more nodes are distinguished as sources. Any node accessible from any one of the sources is considered 'lit'.
Now suppose one of the edges is removed. The problem is to determine the nodes that were previously lit and are not lit anymore.
An analogy like city electricity system may be considered, I presume.
This is a "dynamic graph reachability" problem. The following paper should be useful:
A fully dynamic reachability algorithm for directed graphs with an almost linear update time. Liam Roditty, Uri Zwick. Theory of Computing, 2002.
This gives an algorithm with O(m * sqrt(n))-time updates (amortized) and O(sqrt(n))-time queries on a possibly-cyclic graph (where m is the number of edges and n the number of nodes). If the graph is acyclic, this can be improved to O(m)-time updates (amortized) and O(n/log n)-time queries.
It's always possible you could do better than this given the specific structure of your problem, or by trading space for time.
If instead of just "lit" or "unlit" you would keep a set of nodes from which a node is powered or lit, and consider a node with an empty set as "unlit" and a node with a non-empty set as "lit", then removing an edge would simply involve removing the source node from the target node's set.
EDIT: Forgot this:
And if you remove the last lit-from-node in the set, traverse the edges and remove the node you just "unlit" from their set (and possibly traverse from there too, and so on)
EDIT2 (rephrase for tafa):
Firstly: I misread the original question and thought that it stated that for each node it was already known to be lit or unlit, which as I re-read it now, was not mentioned.
However, if for each node in your network you store a set containing the nodes it was lit through, you can easily traverse the graph from the removed edge and fix up any lit/unlit references.
So for example if we have nodes A,B,C,D like this: (lame attempt at ascii art)
A -> B >- D
\-> C >-/
Then at node A you would store that it is a source (and thus lit by itself), in both B and C you would store that they are lit by A, and in D you would store that it is lit by both B and C.
Then say we remove the edge from B to D: in D we remove B from the lit-by set, but D remains lit as it is still lit by C. Next say we remove the edge from A to C: A is removed from C's set, and thus C is no longer lit. We then go on to traverse the edges that originate at C and remove C from D's set, so D is now also unlit. In this case we are done, but if the graph were bigger, we'd just go on from D.
This algorithm will only ever visit the nodes that are directly affected by a removal or addition of an edge, and as such (apart from the extra storage needed at each node) should be close to optimal.
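A small Python sketch of this bookkeeping is shown below, using the A, B, C, D example. The class name `LitGraph` and its methods are invented for the illustration. One caveat that should be stated loudly: the sketch assumes an acyclic graph, because with cycles two nodes can keep each other in their lit-by sets after being cut off from every source, and the simple propagation would then miss them.

```python
from collections import defaultdict, deque

class LitGraph:
    """Tracks, per node, the set of lit direct predecessors (acyclic graphs only)."""

    def __init__(self, edges, sources):
        self.succ = defaultdict(set)
        for u, v in edges:
            self.succ[u].add(v)
        self.sources = set(sources)
        self.lit_by = defaultdict(set)
        self.lit = set(self.sources)
        queue = deque(self.sources)
        while queue:                            # initial BFS from the sources
            u = queue.popleft()
            for v in self.succ[u]:
                self.lit_by[v].add(u)
                if v not in self.lit:
                    self.lit.add(v)
                    queue.append(v)

    def remove_edge(self, u, v):
        """Remove edge u->v and return the nodes that stop being lit."""
        self.succ[u].discard(v)
        newly_unlit = []
        pending = deque([(u, v)])
        while pending:
            a, b = pending.popleft()
            self.lit_by[b].discard(a)
            if b in self.lit and b not in self.sources and not self.lit_by[b]:
                self.lit.discard(b)
                newly_unlit.append(b)
                for c in self.succ[b]:          # b no longer lights its successors
                    pending.append((b, c))
        return newly_unlit

g = LitGraph([("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")], sources=["A"])
print(g.remove_edge("B", "D"))   # D is still lit through C
print(g.remove_edge("A", "C"))   # C goes dark, and then D
```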
Is this your homework?
The simplest solution is to do a DFS (http://en.wikipedia.org/wiki/Depth-first_search) or a BFS (http://en.wikipedia.org/wiki/Breadth-first_search) on the original graph starting from the source nodes. This will get you all the original lit nodes.
Now remove the edge in question. Do the DFS again. You get the nodes which still remain lit.
Output the nodes that appear in the first set but not the second.
This is an asymptotically optimal algorithm, since you do two DFSs (or BFSs), which take O(n + m) time and space (where n = number of nodes, m = number of edges) and dominate the complexity. You need at least Ω(n + m) time and space to read the input, therefore the algorithm is optimal.
Now if you want to remove several edges, that would be interesting. In this case, we would be talking about dynamic data structures. Is this what you intended?
EDIT: Taking into account the comments:
not connected is not a problem, since nodes in unreachable connected components will not be reached during the search
there is a smart way to do the DFS or BFS from all nodes at once (I will describe BFS). You just have to put them all at the beginning on the stack/queue.
Pseudo code for a BFS which searches for all nodes reachable from any of the starting nodes:
Queue q = [all starting nodes]
mark every starting node as visited
while (q not empty)
{
    x = q.pop()
    forall (y neighbour of x) {
        if (y was not visited) {
            visited[y] = true
            q.push(y)
        }
    }
}
Replace Queue with a Stack and you get a sort of DFS.
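Putting the pieces of this answer together, a runnable Python sketch of the "search before, search after, take the difference" approach could look like this. The function names and the dict-of-lists graph format are assumptions made for the example.

```python
from collections import deque

def reachable(adj, sources):
    """Multi-source BFS: all nodes reachable from any of the sources."""
    visited = set(sources)
    queue = deque(sources)
    while queue:
        x = queue.popleft()
        for y in adj.get(x, ()):
            if y not in visited:
                visited.add(y)
                queue.append(y)
    return visited

def newly_unlit(adj, sources, removed_edge):
    """Nodes that were lit before removing the edge and are not lit afterwards."""
    before = reachable(adj, sources)
    pruned = {x: [y for y in ys if (x, y) != removed_edge] for x, ys in adj.items()}
    after = reachable(pruned, sources)
    return before - after

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
print(newly_unlit(graph, ["A"], ("A", "C")))
```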
How big and how connected are the graphs? You could store all paths from the source nodes to all other nodes and look for nodes where all paths to that node contain one of the removed edges.
EDIT: Extend this description a bit
Do a DFS from each source node. Keep track of all paths generated to each node (as edges, not vertices, so then we only need to know the edges involved, not their order, and so we can use a bitmap). Keep a count for each node of the number of paths from source to node.
Now iterate over the paths. Remove any path that contains the removed edge(s) and decrement the counter for that node. If a node counter is decremented to zero, it was lit and now isn't.
I would keep the information about connected source nodes on the edges while building the graph (for example, if an edge has connectivity to the sources S1 and S2, its source list contains S1 and S2), and create the nodes with the information about their input edges and output edges. When an edge is removed, update the output edges of the target node of that edge by considering the input edges of the node, and traverse through all the target nodes of the updated edges using DFS or BFS. (In the case of a cyclic graph, consider marking visited nodes.) While updating the graph, it is also possible to find nodes without any edge that has a source connection (the lit->unlit nodes). However, it might not be a good solution if you'd like to remove multiple edges at the same time, since that may cause the same edges to be traversed again and again.
