DFS on each node of the graph - algorithm

Let's say in my directed graph G = (V, E), I have nodes which have certain numbers assigned to it, n[v]. I want to find the highest n[v] for each vertex, that is max(n[v]) means the maximum n[v] of the node reachable from each vertex in G.
What would we the efficient to solve this?
I am thinking on the lines of DFS on each node without backtracking and comparing the n[v] for all the 'visited' nodes and storing the maximum n[v] value for that path.
However, I am afraid that this might not be the efficient solution.

Why do you think this is inefficient?
If you run DFS on all source nodes (nodes that only out edges and no edges coming in) you are guaranteed to visit all nodes.
You can easily store the max(n[v]) on a node only after you visit all of its children.
And then if you reach a node that already has a max(n[v]) after starting from another source, you know you don't have to continue exploring that nodes children because the value is only assigned once you've visited all the children.
You're doing this in ~|E| steps (unless you have isolated nodes with no in our out edges) since you'll travel each edge exactly once.
I guess if you want to be super accurate you'll be doing this in |E| + |S| steps where |S| is the number of source nodes in your graph.
This seems to be pretty efficient to me. I'm not sure if it's possible to get a deterministic answer without checking every edge at least once.
Edit:
Also, I guess if you want to be extra complete, you may need to run a pre-step of checking all V nodes to determine which are source nodes. AND THEN running DFS on only source nodes.
So overall if would be |V| + |E| + |S| steps which is O(|V| + |E|)

Related

simple path in Graph with special nodes with max amount of special nodes

I have this problem:
Given a graph G = (V,E), that has a subset of the nodes called, R, that are "special" nodes. The amount of special nodes can very from case to case. The graph can be directed, undirected, does not have weights, and can contain cycles.
Now, I need a algorithm that can find a path from a node s to a node t, that passes through a maximum amount of the "special" nodes in R.
Im aware that this problem is np-hard, and is easily reducible from hamiltionian path, but I i've been looking for different ways of solving it without having to bruteforce all paths.
First attempt
First I tried doing some preprocessing of the graph, where every edge that goes to a "normal" node, gets a weight of 2, and every edge to a node in R gets a weight of 0.
Then I would just run dijkstra on the graph.
A counterexample this could however look like this:
In this graph, dijkstra would pick path [s,4,t] even though path [s,1,2,3,t] is a actual simple path with the maximum amount of red nodes
Second Attempt
My second attempt was a bit more convoluted. In this attempt I would run a bfs from the s-node and each R-node in the graph. I would then createa a new reachabilitygraph that could model which R-nodes that are connected to each other.
This approach would run into major issues in any graph that has cycles or is not directed, as connections between R-nodes that did not exist in the original graph would be included in the new graph.
So if anyone has any bids on any smart preprocessing steps that I could take, I would be cery happy
Your first method seems good, for example:
weigh of all edges to some node v in R = 1
weigh of rest of edges = 0
Then run Dijkstra with a cutoff = max special nodes

Find Two vertices with lowest path weight

I am trying to solve this question but got stuck.
Need some help,Thanks.
Given an undirected Connected graph G with non-negative values at edges.
Let A be a subgroup of V(G), where V(G) is the group of vertices in G.
-Find a pair of vertices (a,b) that belongs to A, such that the weight of the shortest path between them in G is minimal, in O((E+V)*log(v)))
I got the idea of using Dijkstra's algorithm in each node which will give me O(V*((E+V)logv))),which is too much.
So thought about connecting the vertices in A somehow,did'nt find any useful way.
Also tried changing the way Dijkstra's algorithm work,But it get's to hard to prove with no improvment in time complexity.
Note that if the optimal pair is (a, b), then from every node u in the optimal path, a and b are the closest two nodes in A.
I believe we should extend Dijkstra's algorithm in the following manners:
Start with all nodes in A, instead of a single source_node.
For each node, don't just remember the shortest_distance and the previous_node, but also the closest_source_node to remember which node in A gave the shortest distance.
Also, for each node, remember the second_shortest_distance, the second_closest_source_node, and previous_for_second_closest_source_node (shorter name suggestions are welcome). Make sure that second_closest_source_node is never the closest_source_node. Also, think carefully about how you update these variables, the optimal path for a node can become part of the second best path for it's neighbour.
Visit the entire graph, don't just stop at the first node whose closest_source and second_closest_source are found.
Once the entire graph is covered, search for the node whose shortest_distance + second_shortest_distance is smallest.

Is there an edge we can delete without disconnecting the graph?

Before I start, yes this is a homework.
I would not have posted here if I haven't been trying as hard as I could to solve this one for the last 14 hours and got nowhere.
The problem is as follows:
I want to check whether I can delete an edge from a connected undirected graph without disconnecting it or not in O(V) time, not just linear.
What I have reached so far:
A cycle edge can be removed without disconnecting the graph, so I simply check if the graph has a cycle.
I have two methods that could be used, one is DFS and then checking if I have back edges; the other is by counting Vs and Es and checking if |E| = |V| - 1, if that's true then the graph is a tree and there's no node we can delete without disconnecting it.
Both of the previous approaches solve the problem, but both need O(|E|+|V|), and the book says there's a faster way(that's probably a greedy approach).
Can I get any hints, please?
EDIT:
More specifically, this is my question; given a connected graph G=(V,E), can I remove some edge e in E and have the resulting graph still be connected?
Any recursive traversal of the graph, marking nodes as they're visited and short-circuiting to return true if you ever run into a node that is already marked will do the trick. This takes O(|V|) to traverse the entire graph if there is no edge that can be removed, and less time if it stops early to return true.
edit
Yes, a recusive traversal of the entire graph requires O(|V|+|E|) time, but we only traverse the entire graph if there are no cycles -- in which case |E| = |V|-1 and that only takes O(|V|) time. If there is a cycle, we'll find it after traversing at most |V| edges (and visiting at most |V|+1 nodes), which likewise takes O(|V|) time.
Also, obviously when traversing from a node (other than the first), you don't consider the edge you used to get to the node, as that would cause you to immediately see an already visited node.
You list all edges E and take an edge and mark one by one the two end vertices visited. If during traversing we find that the two vertices have been visited previously by some edges and we can remove that edge.
We have to take edges at most |V| time to see whether this condition satisfy.
Worst case may go like this, each time we take an edge it will visit atleast new vertex. Then there are |V| vertices and we have to take |V| edges for that particular edge to be found.
Best case may be the one with |V| / 2 + 1 e
Have you heard of spanning trees? A connected graph with V-1 edges.
We can remove certain edges from a connected graph G (like the ones which are creating cycle) until we get a connected tree. Notice that question is not asking you to find a spanning tree.
Question is asking if you can remove one or more edges from graph without loosing connectivity. Simply count number of edges and break when count grows beyond V-1 because the graph has scope to remove more edges and become spanning tree. It can be done in O(V) times if the graph is given in adjacency list.
From what I'm reading, DFS without repetition is considered O(|V|), so if you take edge e, and let the two vertices it connects be u and v, if you run DFS from u, ignoring e, you can surmise that e is not a bridge if v is discovered, and given that DFS without repetition is O(|V|), then this would, I guess be considered O(|V|).

Most efficient way to visit nodes of a DAG in order

I have a large (100,000+ nodes) Directed Acyclic Graph (DAG) and would like to run a "visitor" type function on each node in order, where order is defined by the arrows in the graph. i.e. all parents of a node are guaranteed to be visited before the node itself.
If two nodes do not refer to each other directly or indirectly, then I don't care which order they are visited in.
What's the most efficient algorithm to do this?
You would have to perform a topological sort on the nodes, and visit the nodes in the resulting order.
The complexity of such algorithm is O(|V|+|E|) which is quite good. You want to traverse all nodes, so if you would want a faster algorithm than that, you would have to solve it without even looking at all edges, which would be dangerous, because one single edge could havoc the order completely.
There are some answers here:
Good graph traversal algorithm
and here:
http://en.wikipedia.org/wiki/Topological_sorting
In general, after visiting a node, you should visit its related nodes, but only the nodes that are not already visited. In order to keep track of the visited nodes, you need to keep the IDs of the nodes in a set (or map), or you can mark the node as visited (somehow).
If you care about the topological order, you must first get hold of a collection of all the un-traversed links ("remaining links") to a node, sorted by the id of the referenced node (typically: map(node-ID -> link-count)). If you haven't got that, you might need to build it using an approach similar to the one above. Then, start by visiting a node whose remaining incoming link count is zero. For each link from that node, reduce the remaining link count for each related node, adding the related node to the set of nodes-to-visit (or just visiting the node) if the count reaches zero.
As mentioned in the other answers, this problem can be solved by Topological Sorting.
A very simple algorithm for that (not the most efficient):
Keep an array (or map) indegree[] where indegree[node]=number of incoming edges of node
while there is at least one node n with indegree[n]=0:
for each node n in nodes where indegree[n]>0:
visit(n)
indegree[n]=-1 # mark n as visited
for each node x adjacent to n:
indegree[x]=indegree[x]-1 # its parent has been visited, so one less edge coming into it
You can traverse a DAG in O(N) (without any topsort) by just running your dfs from every node with zero indegree, because those will be the valid "starting point". This will work because graph has no cycles, those zero indegree nodes must exist, and must traverse the whole graph.

Is there a proper algorithm to solve edge-removing problem?

There is a directed graph (not necessarily connected) of which one or more nodes are distinguished as sources. Any node accessible from any one of the sources is considered 'lit'.
Now suppose one of the edges is removed. The problem is to determine the nodes that were previously lit and are not lit anymore.
An analogy like city electricity system may be considered, I presume.
This is a "dynamic graph reachability" problem. The following paper should be useful:
A fully dynamic reachability algorithm for directed graphs with an almost linear update time. Liam Roditty, Uri Zwick. Theory of Computing, 2002.
This gives an algorithm with O(m * sqrt(n))-time updates (amortized) and O(sqrt(n))-time queries on a possibly-cyclic graph (where m is the number of edges and n the number of nodes). If the graph is acyclic, this can be improved to O(m)-time updates (amortized) and O(n/log n)-time queries.
It's always possible you could do better than this given the specific structure of your problem, or by trading space for time.
If instead of just "lit" or "unlit" you would keep a set of nodes from which a node is powered or lit, and consider a node with an empty set as "unlit" and a node with a non-empty set as "lit", then removing an edge would simply involve removing the source node from the target node's set.
EDIT: Forgot this:
And if you remove the last lit-from-node in the set, traverse the edges and remove the node you just "unlit" from their set (and possibly traverse from there too, and so on)
EDIT2 (rephrase for tafa):
Firstly: I misread the original question and thought that it stated that for each node it was already known to be lit or unlit, which as I re-read it now, was not mentioned.
However, if for each node in your network you store a set containing the nodes it was lit through, you can easily traverse the graph from the removed edge and fix up any lit/unlit references.
So for example if we have nodes A,B,C,D like this: (lame attempt at ascii art)
A -> B >- D
\-> C >-/
Then at node A you would store that it was a source (and thus lit by itself), and in both B and C you would store they were lit by A, and in D you would store that it was lit by both A and C.
Then say we remove the edge from B to D: In D we remove B from the lit-source-list, but it remains lit as it is still lit by A. Next say we remove the edge from A to C after that: A is removed from C's set, and thus C is no longer lit. We then go on to traverse the edges that originated at C, and remove C from D's set which is now also unlit. In this case we are done, but if the set was bigger, we'd just go on from D.
This algorithm will only ever visit the nodes that are directly affected by a removal or addition of an edge, and as such (apart from the extra storage needed at each node) should be close to optimal.
Is this your homework?
The simplest solution is to do a DFS (http://en.wikipedia.org/wiki/Depth-first_search) or a BFS (http://en.wikipedia.org/wiki/Breadth-first_search) on the original graph starting from the source nodes. This will get you all the original lit nodes.
Now remove the edge in question. Do again the DFS. You can the nodes which still remain lit.
Output the nodes that appear in the first set but not the second.
This is an asymptotically optimal algorithm, since you do two DFSs (or BFSs) which take O(n + m) times and space (where n = number of nodes, m = number of edges), which dominate the complexity. You need at least o(n + m) time and space to read the input, therefore the algorithm is optimal.
Now if you want to remove several edges, that would be interesting. In this case, we would be talking about dynamic data structures. Is this what you intended?
EDIT: Taking into account the comments:
not connected is not a problem, since nodes in unreachable connected components will not be reached during the search
there is a smart way to do the DFS or BFS from all nodes at once (I will describe BFS). You just have to put them all at the beginning on the stack/queue.
Pseudo code for a BFS which searches for all nodes reachable from any of the starting nodes:
Queue q = [all starting nodes]
while (q not empty)
{
x = q.pop()
forall (y neighbour of x) {
if (y was not visited) {
visited[y] = true
q.push(y)
}
}
}
Replace Queue with a Stack and you get a sort of DFS.
How big and how connected are the graphs? You could store all paths from the source nodes to all other nodes and look for nodes where all paths to that node contain one of the remove edges.
EDIT: Extend this description a bit
Do a DFS from each source node. Keep track of all paths generated to each node (as edges, not vertices, so then we only need to know the edges involved, not their order, and so we can use a bitmap). Keep a count for each node of the number of paths from source to node.
Now iterate over the paths. Remove any path that contains the removed edge(s) and decrement the counter for that node. If a node counter is decremented to zero, it was lit and now isn't.
I would keep the information of connected source nodes on the edges while building the graph.(such as if edge has connectivity to the sources S1 and S2, its source list contains S1 and S2 ) And create the Nodes with the information of input edges and output edges. When an edge is removed, update the output edges of the target node of that edge by considering the input edges of the node. And traverse thru all the target nodes of the updated edges by using DFS or BFS. (In case of a cycle graph, consider marking). While updating the graph, it is also possible to find nodes without any edge that has source connection (lit->unlit nodes). However, it might not be a good solution, if you'd like to remove multiple edges at the same time since that may cause to traverse over same edges again and again.

Resources