I have a list of nodes which belong in a graph. The graph is directed and does not contain cycles. Also, some of the nodes are marked as "end" nodes. Every node has a set of input nodes I can use.
The question is the following: How can I sort (ascending) the nodes in the list by their biggest distance to any reachable end node? Here is an example of how the graph could look.
I have already added the calculated distance by which I can sort the nodes (grey). The end nodes have the distance 0, while C, D and G have the distance 1. F, however, has the distance 3, even though the route over D would be shorter (2), because it is the biggest distance to a reachable end node that counts.
I have come up with a concept that I think would solve the problem. Here is some pseudo-code:
sortedTable<Node, depth> // used to store nodes and their currently calculated distance
tempTable<Node>          // used to store the nodes of the current layer
currentDepth = 0;

- fill tempTable with end nodes

while (tempTable is not empty)
{
    - create empty newTempTable<Node>

    // add tempTable to sortedTable
    for (every "node" in tempTable)
    {
        if ("node" is in sortedTable)
        {
            - overwrite depth in sortedTable with currentDepth
        }
        else
        {
            - add (node, currentDepth) to sortedTable
        }

        // collect the nodes of the next layer
        for (every "newNode" connected to node)
        {
            - add newNode to newTempTable
        }
    }
    - tempTable = newTempTable
    currentDepth++;
}
This approach should work. However, the problem with this algorithm is that it basically builds a tree from the graph starting at every end node and then corrects the old distance calculations at every depth. For example: G would first get the depth 1 (calculated directly over B), then the depth 3 (calculated over A, D and F) and then the depth 4 (calculated over A, C, E and F).
Do you have a better solution to this problem?
It can be done with dynamic programming.
The graph is a DAG, so first do a topological sort on the graph; let the sorted order be v1,v2,v3,...,vn.
Now, set D(v)=0 for every end node, and go from last to first (according to the topological order), computing:
D(v) = max { D(u) + 1, for each edge (v,u) }
It works because the graph is a DAG, and when processed in reverse topological order, the values D(u) for all outgoing edges (v,u) are already known.
Example on your graph:
Topological sort (one possible):
H,G,B,F,D,E,C,A
Then, the algorithm:
init:
D(B)=D(A)=0
Go back from last to first:
D(A) - no out edges, done
D(C) = max{D(A) + 1} = max{0+1}=1
D(E) = max{D(C) + 1} = 2
D(D) = max{D(A) + 1} = 1
D(F) = max{D(E)+1, D(D)+1} = max{2+1,1+1} = 3
D(B) = 0
D(G) = max{D(B)+1,D(F)+1} = max{1,4}=4
D(H) = max{D(G) + 1} = 5
As a side note, if the graph is not a DAG, but a general graph, this is a variant of the Longest Path Problem, which is NP-Complete.
Luckily, it does have an efficient solution when our graph is a DAG.
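For illustration, here is a rough Java sketch of this approach (not the original poster's code). It assumes the graph is given as adjacency lists mapping each node to its successors, with edges pointing towards the end nodes as in the example, and it uses made-up names such as longestDistances:

import java.util.*;

class LongestDistanceToEnd {
    // graph: node -> list of successor nodes (edges point towards the end nodes)
    static Map<String, Integer> longestDistances(Map<String, List<String>> graph) {
        // 1. topological sort via DFS post-order
        List<String> order = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        for (String v : graph.keySet()) topoSort(v, graph, visited, order);
        // 'order' now lists successors before predecessors (reverse topological order)

        // 2. dynamic programming: D(v) = max over edges (v,u) of D(u) + 1, D(end) = 0
        Map<String, Integer> dist = new HashMap<>();
        for (String v : order) {
            int best = 0;
            for (String u : graph.getOrDefault(v, List.of()))
                best = Math.max(best, dist.getOrDefault(u, 0) + 1);
            dist.put(v, best);
        }
        return dist;
    }

    private static void topoSort(String v, Map<String, List<String>> graph,
                                 Set<String> visited, List<String> order) {
        if (!visited.add(v)) return;
        for (String u : graph.getOrDefault(v, List.of()))
            topoSort(u, graph, visited, order);
        order.add(v); // post-order: all successors of v are already in the list
    }

    public static void main(String[] args) {
        // the example graph from the answer above: A and B are the end nodes
        Map<String, List<String>> g = new HashMap<>();
        g.put("H", List.of("G"));
        g.put("G", List.of("B", "F"));
        g.put("F", List.of("E", "D"));
        g.put("E", List.of("C"));
        g.put("D", List.of("A"));
        g.put("C", List.of("A"));
        System.out.println(longestDistances(g)); // A=0, B=0, C=1, D=1, E=2, F=3, G=4, H=5 (map order may vary)
    }
}

Sorting the nodes ascending by the computed value then answers the original question.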
Problem
Given:
A directed graph G
A source vertex s in G and a target vertex t in G
A set S of vertices of G
I want to find a collection of paths from s to t that covers S.
Then I want to partition the collection of paths into subcollections of vertex-disjoint paths.
Under these constraints, the objective is to minimise the number of subcollections.
Example
For instance, [C1 = {p1,p2,p3}, C2= {p4,p5}, C3= {p6,p7}] is a solution if:
each p_i is a path from s to t
p1,p2,p3 have no vertices in common except s and t;
p4, p5 have no vertices in common except s and t;
p6,p7 have no vertices in common except s and t;
collectively, the 7 paths cover all vertices of S.
In that case, the number of subcollections is 3.
Question
What are some good algorithms or heuristics for this optimisation problem?
I already know min-cost flow and disjoint-path algorithms, but they don't apply in my setting.
I tried min-cost flow / node-disjoint paths, but one run only gives one collection at a time, and I don't know how to adjust the costs so that the unexplored vertices get covered.
Given:
A directed graph G
A source vertex s in G and a target vertex t in G
A set S of vertices of G
I want to find a collection of paths from s to t that covers S.
Use Dijkstra's algorithm to find a path from s to every vertex in S and from every point in S to t.
Connect the paths to and from each S vertex into one path from s to t via a point in S.
You now have a collection of paths that, together, cover S. Let's call it CS.
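A minimal sketch of this first step, assuming an unweighted graph stored as adjacency lists (so plain BFS stands in for Dijkstra); the names coveringPaths and bfsPath are made up for illustration and this is not the code from the linked repository:

import java.util.*;

class PathCover {
    // graph: vertex -> outgoing neighbours (assumed unweighted, so BFS finds shortest paths)
    static List<List<Integer>> coveringPaths(Map<Integer, List<Integer>> graph,
                                             int s, int t, Set<Integer> S) {
        List<List<Integer>> cs = new ArrayList<>();
        for (int v : S) {
            List<Integer> toV = bfsPath(graph, s, v);    // shortest path s -> v
            List<Integer> toT = bfsPath(graph, v, t);    // shortest path v -> t
            if (toV == null || toT == null) continue;    // v cannot be covered at all
            List<Integer> path = new ArrayList<>(toV);
            path.addAll(toT.subList(1, toT.size()));     // drop the duplicated v
            cs.add(path);
        }
        return cs;                                       // CS: together these paths cover S
    }

    // plain BFS shortest path; returns null if 'to' is unreachable from 'from'
    private static List<Integer> bfsPath(Map<Integer, List<Integer>> graph, int from, int to) {
        Map<Integer, Integer> parent = new HashMap<>();
        parent.put(from, from);
        Deque<Integer> queue = new ArrayDeque<>(List.of(from));
        while (!queue.isEmpty() && !parent.containsKey(to)) {
            int cur = queue.poll();
            for (int next : graph.getOrDefault(cur, List.of()))
                if (!parent.containsKey(next)) {
                    parent.put(next, cur);
                    queue.add(next);
                }
        }
        if (!parent.containsKey(to)) return null;
        LinkedList<Integer> path = new LinkedList<>();
        for (int cur = to; ; cur = parent.get(cur)) {
            path.addFirst(cur);
            if (cur == from) break;
        }
        return path;
    }
}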
Then I want to partition the collection of paths into subcollections
of vertex-disjoint paths.
Note that if s, the source vertex, has an out degree of sOD, there can be no more than sOD paths in each vertex disjoint collection.
Construct vVDP, an empty vector of vertex disjoint path collections
LOOP P over paths in CS
SET found FALSE
LOOP cs over collections in vVDP
IF P is vertex disjoint with every path in cs
add P to cs
SET found TRUE
BREAK out of LOOP cs
IF found == false
ADD collection containing P to vVDP
Here is a C++ implementation of this algorithm
void cProblem::collect()
{
// loop over paths
for (auto &P : vpath)
{
// loop over collections
bool found = false;
for (auto &cs : vVDP)
{
//loop over paths in collection
bool disjoint = true; // initialised so an empty collection counts as disjoint
for (auto &csPath : cs)
{
// check that P is vertex disjoint with path in collection
disjoint = true;
for (auto vc : csPath)
{
for (auto &vp : P)
{
if (vp == vc) {
disjoint = false;
break;
}
}
}
if( ! disjoint )
break;
}
if (disjoint)
{
// P is vertex disjoint from every path in collection
// add P to the collection
cs.push_back(P);
found = true;
break;
}
}
if (!found)
{
// P was NOT vertex disjoint with the paths in any collection
// start a new collection with P
std::vector<std::vector<int>> collection;
collection.push_back(P);
vVDP.push_back(collection);
}
}
}
The complete application is at https://github.com/JamesBremner/so75419067
Detailed documentation of the required input file format is at
https://github.com/JamesBremner/so75419067/wiki
If you post a real example in the correct format, I will run the algorithm on it for you.
So I'm learning Graphs and DFS. I saw this question that involved it:
There is a bi-directional graph with n vertices, where each vertex is labeled from 0 to n - 1 (inclusive). The edges in the graph are represented as a 2D integer array edges, where each edges[i] = [ui, vi] denotes a bi-directional edge between vertex ui and vertex vi. Every vertex pair is connected by at most one edge, and no vertex has an edge to itself.
You want to determine if there is a valid path that exists from vertex source to vertex destination.
Given edges and the integers n, source, and destination, return true if there is a valid path from source to destination, or false otherwise.
Example 1:
Input: n = 3, edges = [[0,1],[1,2],[2,0]], source = 0, destination = 2
Output: true
Explanation: There are two paths from vertex 0 to vertex 2:
0 → 1 → 2
0 → 2
Example 2:
Input: n = 6, edges = [[0,1],[0,2],[3,5],[5,4],[4,3]], source = 0, destination = 5
Output: false
Explanation: There is no path from vertex 0 to vertex 5.
Here's a solution that solves the problem:
fun validPath(n: Int, edges: Array<IntArray>, source: Int, destination: Int): Boolean {
val graph = buildGraph(n, edges)
val seen = HashSet<Int>().also{
it.add(source)
}
return dfs(seen, source, destination, graph)
}
private fun dfs(seen: HashSet<Int>, cur: Int, end: Int, graph: List<HashSet<Int>>): Boolean{
if(cur == end){
return true
}
//checks to see if the path is valid
for (next in graph[cur]){
if(seen.add(next))
if(dfs(seen, next, end, graph))
return true
}
return false
}
private fun buildGraph(n: Int, edges: Array<IntArray>): List<HashSet<Int>>{
val graph = MutableList(n){ hashSetOf<Int>() }
for((u, v) in edges){
graph[u].add(v)
graph[v].add(u)
}
return graph
}
I would like to know what the DFS function is doing in this particular question. Is the function checking to see if the path is a valid path, or something else? Thank you for your time.
I think the comment in the DFS function explains it, but you have to remember the problem statement, where a "valid path" is defined. It's checking whether you can reach the target node from the current node.
It basically traverses the graph in DFS fashion (always recursing into the next neighbouring node) and makes sure not to call itself for a node it has already visited.
If it reaches the target node, it returns true.
I am working on a problem where I need to find all nodes at distance k from each other. So if k=3, then I need to find all pairs of nodes that are connected by a path of length 3. There are no self edges, so if i has an edge pointing to s, s can't point back to i directly. I have thought of two implementations here, both involving BFS. I noticed an edge case where BFS might not visit all edges because nodes might already be visited.
Do BFS at each node. Keep track of the "level" of each node in some array, where distance[0] is the root node, distance[1] is all nodes adjacent to the root node, distance[2] is all nodes that are grandchildren of the root node, and so on. Then to find all nodes at distance k we look at distance[i] and distance[i+k].
Do BFS once, using the same distance bookkeeping as above, but don't do BFS at each node. Reverse all of the edges and do BFS again to find any missed paths. This would have a much better time complexity than approach 1, but I am not sure whether it would actually explore every edge and path (in my test cases it seemed to).
Is there a better approach to this? As an example in this graph with k = 2:
The paths would be 1 to 3, 1 to 5, 2 to 6, 2 to 5, 4 to 3, 1 to 4.
EDIT: The reversal of edges won't work; my current best bet is to just do a BFS and then a DFS at each node until a depth of k is reached.
You may consider a basic adjacency matrix M whose elements are not 0 or 1 to indicate a connection, but instead hold the set of available path lengths between the two nodes.
e.g.
for 2->5 you would store M(2,5) = {1,2} (because there exists a path of length 1 and a path of length 2 between node 2 and node 5)
Let a and b be two elements of M.
a * b is defined as:
    ab_res = {} // a set without duplicates
    for all path lengths i in a
        for all path lengths j in b
            s = i+j
            append(s) to ab_res
    return ab_res
a + b is defined as:
    ab_res = {}
    for all path lengths i in a
        append(i) to ab_res
    for all path lengths j in b
        append(j) to ab_res
    return ab_res
This approach avoids recomputing paths of smaller length. It works with cycles as well.
Below is an unoptimized version to illustrate the algorithm.
const pathk = 2;
let G = {
1:[2],
2:[3,4,5],
4:[6],
6:[3]
}
//init M
let M = Array(6).fill(0).map(x=>Array(6).fill(0).map(y=>new Set));
Object.keys(G).forEach(m=>{
G[m].forEach(to=>{
M[m-1][to-1] = new Set([1]);
})
});
function add(sums){
let arr = sums.flatMap(s=>[...s]);
return new Set(arr);
}
function times(a,b){
let s = new Set;
[...a].forEach(i=>{
[...b].forEach(j=>{
s.add(i+j);
})
});
return s;
}
function prod(a,b, pathk){
//the GOOD OL ugly matrix product :)
const n = a.length;
let M = Array(n).fill(0).map(x=>Array(n).fill(0).map(y=>new Set));
a.forEach((row,i)=>{
for(let bcol = 0; bcol<n; ++bcol){
let sum = [];
for(let k = 0; k<n; ++k){
sum.push( times(a[i][k], b[k][bcol]) );
}
M[i][bcol] = add(sum);
if(M[i][bcol].has(pathk)){
console.log('got it ', i+1, bcol+1);
}
}
})
return M;
}
//note that
//1. you can use exponentiation to speed things up
//2. you can discard elements in the set if they equal k (or more...)
//3. you may consider operating the product on an adjacency list to save computation time & memory
let Mi = M.map(r=>r.map(x=>new Set([...x])));//copy
for(let i = 1; i<=pathk; ++i){
Mi = prod(Mi,M, pathk);
}
I am having difficulties finding a way to properly classify the edges while doing a breadth-first search on a directed graph.
During a breadth-first or depth-first search, you can classify the edges met with 4 classes:
TREE
BACK
CROSS
FORWARD
Skiena [1] gives an implementation. If you move along an edge from v1 to v2, here is a way to return the class during a DFS in java, for reference. The parents map returns the parent vertex for the current search, and the timeOf() method, the time at which the vertex has been discovered.
if ( v1.equals( parents.get( v2 ) ) ) { return EdgeClass.TREE; }
if ( discovered.contains( v2 ) && !processed.contains( v2 ) ) { return EdgeClass.BACK; }
if ( processed.contains( v2 ) )
{
if ( timeOf( v1 ) < timeOf( v2 ) )
{
return EdgeClass.FORWARD;
}
else
{
return EdgeClass.CROSS;
}
}
return EdgeClass.UNCLASSIFIED;
My problem is that I cannot get it right for a Breadth first search on a directed graph. For instance:
The following graph - a triangle - is OK:
A -> B
A -> C
B -> C
BFSing from A, B will be discovered, then C. The edges eAB and eAC are TREE edges, and when eBC is crossed last, B and C are processed and discovered, and this edge is properly classified as CROSS.
But a plain loop does not work:
A -> B
B -> C
C -> A
When the edge eCA is crossed last, A is processed and discovered. So this edge is incorrectly labeled as CROSS, whereas it should be a BACK edge.
There is indeed no difference in the way the two cases are treated, even if the two edges have different classes.
How do you implement a proper edge classification for a BFS on a directed graph?
[1] http://www.algorist.com/
EDIT
Here is an implementation derived from @redtuna's answer.
I just added a check not to fetch the parent of the root.
I have JUnit tests that show it works for directed and undirected graphs, in the case of a loop, a straight line, a fork, a standard example, a single edge, etc.
#Override
public EdgeClass edgeClass( final V from, final V to )
{
if ( !discovered.contains( to ) ) { return EdgeClass.TREE; }
int toDepth = depths.get( to );
int fromDepth = depths.get( from );
V b = to;
while ( toDepth > 0 && fromDepth < toDepth )
{
b = parents.get( b );
toDepth = depths.get( b );
}
V a = from;
while ( fromDepth > 0 && toDepth < fromDepth )
{
a = parents.get( a );
fromDepth = depths.get( a );
}
if ( a.equals( b ) )
{
return EdgeClass.BACK;
}
else
{
return EdgeClass.CROSS;
}
}
How do you implement a proper edge classification for a BFS on a
directed graph?
As you already established, seeing a node for the first time creates a tree edge. The problem with BFS instead of DFS, as David Eisenstat said before me, is that back edges cannot be distinguished from cross ones just based on traversal order.
Instead, you need to do a bit of extra work to distinguish them. The key, as you'll see, is to use the definition of a cross edge.
The simplest (but memory-intensive) way is to associate every node with the set of its predecessors. This can be done trivially when you visit nodes. When finding a non-tree edge between nodes a and b, consider their predecessor sets. If one is a proper subset of the other, then you have a back edge. Otherwise, it's a cross edge. This comes directly from the definition of a cross edge: it's an edge between nodes where neither is the ancestor nor the descendant of the other on the tree.
A better way is to associate only a "depth" number with each node instead of a set. Again, this is readily done as you visit nodes. Now when you find a non-tree edge between a and b, start from the deeper of the two nodes and follow the tree edges backwards until you go back to the same depth as the other. So for example suppose a was deeper. Then you repeatedly compute a=parent(a) until depth(a)=depth(b).
If at this point a=b then you can classify the edge as a back edge because, as per the definition, one of the nodes is an ancestor of the other on the tree. Otherwise you can classify it as a cross edge because we know that neither node is an ancestor or descendant of the other.
pseudocode:
foreach edge(a,b) in BFS order:
if !b.known then:
b.known = true
b.depth = a.depth+1
edge type is TREE
continue to next edge
while (b.depth > a.depth): b=parent(b)
while (a.depth > b.depth): a=parent(a)
if a==b then:
edge type is BACK
else:
edge type is CROSS
The key property of DFS here is that, given two nodes u and v, the interval [u.discovered, u.processed] is a subinterval of [v.discovered, v.processed] if and only if u is a descendant of v. The times in BFS do not have this property; you have to do something else, e.g., compute the intervals via DFS on the tree that BFS produced. Then the classification is:
1. check for membership in the tree (tree edge);
2. check whether the head's interval contains the tail's (back edge);
3. check whether the tail's interval contains the head's (forward edge);
4. otherwise, declare a cross edge.
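A small Java sketch of that classification, as one possible reading of the idea: it assumes you already have the BFS tree as parent pointers plus a child-list view of the same tree, and that disc/fin hold discover/finish times from a DFS run over that tree only (all names here are invented for illustration):

import java.util.*;

class BfsEdgeClassifier {
    enum EdgeClass { TREE, BACK, FORWARD, CROSS }

    // parent: BFS tree parent of every reachable vertex (root maps to itself)
    static EdgeClass classify(int v1, int v2, int[] parent, int[] disc, int[] fin) {
        if (parent[v2] == v1) return EdgeClass.TREE;
        // v2 is an ancestor of v1 on the BFS tree -> back edge
        if (disc[v2] <= disc[v1] && fin[v1] <= fin[v2]) return EdgeClass.BACK;
        // v1 is an ancestor of v2 on the BFS tree -> forward edge
        if (disc[v1] <= disc[v2] && fin[v2] <= fin[v1]) return EdgeClass.FORWARD;
        return EdgeClass.CROSS;
    }

    // assign discover/finish times by a DFS over the BFS tree only
    static void intervals(int root, List<List<Integer>> children, int[] disc, int[] fin) {
        int[] time = {0};
        dfs(root, children, disc, fin, time);
    }

    private static void dfs(int v, List<List<Integer>> children, int[] disc, int[] fin, int[] time) {
        disc[v] = time[0]++;
        for (int c : children.get(v)) dfs(c, children, disc, fin, time);
        fin[v] = time[0]++;
    }
}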
Instead of timeOf(), you need another vertex property that contains the distance from the root. Let's name it distance.
You have to process a vertex v in the following way:
for (v0 in v.neighbours) {
if (!v0.discovered) {
v0.discovered = true;
v0.parent = v;
v0.distance = v.distance + 1;
}
}
v.processed = true;
After you have processed a vertex v, you can run the following algorithm for every edge (from v1 to v2) of v:
if (!v1.discovered) return EdgeClass.BACK;
else if (!v2.discovered) return EdgeClass.FORWARD;
else if (v1.distance == v2.distance) return EdgeClass.CROSS;
else if (v1.distance > v2.distance) return EdgeClass.BACK;
else {
if (v2.parent == v1) return EdgeClass.TREE;
else return EdgeClass.FORWARD;
}
I need help finding all the shortest paths between two nodes in an unweighted undirected graph.
I am able to find one of the shortest paths using BFS, but so far I am lost as to how I could find and print out all of them.
Any idea of the algorithm / pseudocode I could use?
As a caveat, remember that there can be exponentially many shortest paths between two nodes in a graph. Any algorithm for this will potentially take exponential time.
That said, there are a few relatively straightforward algorithms that can find all the paths. Here's two.
BFS + Reverse DFS
When running a breadth-first search over a graph, you can tag each node with its distance from the start node. The start node is at distance 0, and then, whenever a new node is discovered for the first time, its distance is one plus the distance of the node that discovered it. So begin by running a BFS over the graph, writing down the distances to each node.
Once you have this, you can find a shortest path from the source to the destination as follows. Start at the destination, which will be at some distance d from the start node. Now, look at all nodes with edges entering the destination node. A shortest path from the source to the destination must end by following an edge from a node at distance d-1 to the destination at distance d. So, starting at the destination node, walk backwards across some edge to any node you'd like at distance d-1. From there, walk to a node at distance d-2, a node at distance d-3, etc. until you're back at the start node at distance 0.
This procedure will give you one path back in reverse order, and you can flip it at the end to get the overall path.
You can then find all the paths from the source to the destination by running a depth-first search from the end node back to the start node, at each point trying all possible ways to walk backwards from the current node to a previous node whose distance is exactly one less than the current node's distance.
(I personally think this is the easiest and cleanest way to find all possible paths, but that's just my opinion.)
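Here is a rough Java sketch of this BFS + reverse DFS idea, assuming an unweighted undirected graph given as adjacency lists; names such as allShortestPaths and walkBack are invented for the example:

import java.util.*;

class AllShortestPaths {
    static List<List<Integer>> allShortestPaths(List<List<Integer>> adj, int src, int dst) {
        // 1. BFS from src, tagging every node with its distance from src
        int n = adj.size();
        int[] dist = new int[n];
        Arrays.fill(dist, -1);
        dist[src] = 0;
        Deque<Integer> queue = new ArrayDeque<>(List.of(src));
        while (!queue.isEmpty()) {
            int cur = queue.poll();
            for (int next : adj.get(cur))
                if (dist[next] == -1) {
                    dist[next] = dist[cur] + 1;
                    queue.add(next);
                }
        }

        // 2. reverse DFS from dst, only stepping to neighbours whose distance is one less
        List<List<Integer>> result = new ArrayList<>();
        if (dist[dst] != -1)
            walkBack(adj, dist, dst, src, new ArrayDeque<>(), result);
        return result;
    }

    private static void walkBack(List<List<Integer>> adj, int[] dist, int cur, int src,
                                 Deque<Integer> suffix, List<List<Integer>> result) {
        suffix.push(cur); // the path from cur to dst, built back to front
        if (cur == src) {
            result.add(new ArrayList<>(suffix)); // reads src ... dst in order
        } else {
            for (int prev : adj.get(cur))
                if (dist[prev] == dist[cur] - 1)
                    walkBack(adj, dist, prev, src, suffix, result);
        }
        suffix.pop();
    }
}

Every recursive step moves from distance d to d-1, so only shortest paths are produced; as noted above, the number of such paths can still be exponential.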
BFS With Multiple Parents
This next algorithm is a modification to BFS that you can use as a preprocessing step to speed up generation of all possible paths. Remember that as BFS runs, it proceeds outwards in "layers," getting a single shortest path to all nodes at distance 0, then distance 1, then distance 2, etc. The motivating idea behind BFS is that any node at distance k + 1 from the start node must be connected by an edge to some node at distance k from the start node. BFS discovers this node at distance k + 1 by finding some path of length k to a node at distance k, then extending it by some edge.
If your goal is to find all shortest paths, then you can modify BFS by extending every path to a node at distance k to all the nodes at distance k + 1 that they connect to, rather than picking a single edge. To do this, modify BFS in the following way: whenever you process an edge by adding its endpoint in the processing queue, don't immediately mark that node as being done. Instead, insert that node into the queue annotated with which edge you followed to get to it. This will potentially let you insert the same node into the queue multiple times if there are multiple nodes that link to it. When you remove a node from the queue, then you mark it as being done and never insert it into the queue again. Similarly, rather than storing a single parent pointer, you'll store multiple parent pointers, one for each node that linked into that node.
If you do this modified BFS, you will end up with a DAG where every node will either be the start node and have no outgoing edges, or will be at distance k + 1 from the start node and will have a pointer to each node of distance k that it is connected to. From there, you can reconstruct all shortest paths from some node to the start node by listing of all possible paths from your node of choice back to the start node within the DAG. This can be done recursively:
There is only one path from the start node to itself, namely the empty path.
For any other node, the paths can be found by following each outgoing edge, then recursively extending those paths to yield a path back to the start node.
This approach takes more time and space than the one listed above because many of the paths found this way will not be moving in the direction of the destination node. However, it only requires a modification to BFS, rather than a BFS followed by a reverse search.
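And a sketch of the modified BFS: rather than literally re-inserting annotated nodes into the queue as described above, this common equivalent records, for every node, all parents that reach it at distance exactly one less (the names are again invented):

import java.util.*;

class MultiParentBfs {
    // returns parents[v] = every neighbour u with dist[u] + 1 == dist[v]
    static List<List<Integer>> shortestPathDag(List<List<Integer>> adj, int src) {
        int n = adj.size();
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[src] = 0;
        List<List<Integer>> parents = new ArrayList<>();
        for (int i = 0; i < n; i++) parents.add(new ArrayList<>());

        Deque<Integer> queue = new ArrayDeque<>(List.of(src));
        while (!queue.isEmpty()) {
            int cur = queue.poll();
            for (int next : adj.get(cur)) {
                if (dist[next] == Integer.MAX_VALUE) {     // first time we see this node
                    dist[next] = dist[cur] + 1;
                    parents.get(next).add(cur);
                    queue.add(next);
                } else if (dist[next] == dist[cur] + 1) {  // another shortest way into it
                    parents.get(next).add(cur);
                }
            }
        }
        return parents; // the DAG of all shortest paths from src
    }
}

Enumerating all shortest paths from some node back to the start node is then the recursive walk over these parent lists described above.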
Hope this helps!
@templatetypedef is correct, but he forgot to mention the distance check that must be done before any parent links are added to a node. This means that we keep the distance from the source in each node and increment the distance by one for its children. We must skip this increment and the parent addition in case the child was already visited and has a lower distance.
public void addParent(Node n) {
    // forbid the parent if its level is equal to ours
if (n.level == level) {
return;
}
parents.add(n);
level = n.level + 1;
}
The full Java implementation can be found at the following link.
http://ideone.com/UluCBb
I encountered a similar problem while solving this: https://oj.leetcode.com/problems/word-ladder-ii/
The way I tried to deal with it was to first find the shortest distance using BFS; let's say the shortest distance is d. Now apply DFS, and in the DFS recursive calls don't go beyond recursion depth d.
However, this might end up exploring all paths, as mentioned by @templatetypedef.
First, find the distance-to-start of all nodes using breadth-first search.
(if there are a lot of nodes, you can use A* and stop when top of the queue has distance-to-start > distance-to-start(end-node). This will give you all nodes that belong to some shortest path)
Then just backtrack from the end-node. Anytime a node is connected to two (or more) nodes with a lower distance-to-start, you branch off into two (or more) paths.
@templatetypedef, your answer was very good, thank you a lot for that one (!!), but it missed one point:
If you have a graph like this:
A-B-C-E-F
  |     |
  D------
Now let's imagine I want this path:
A -> E.
It will expand like this:
A -> B -> D -> C -> F -> E.
The problem there is that you will have F as a parent of E, but
A->B->D->F->E is longer than
A->B->C->E. You will have to take care of tracking the distances of the parents you are so happily adding.
Step 1: Traverse the graph from the source by BFS and assign each node the minimal distance from the source
Step 2: The distance assigned to the target node is the shortest length
Step 3: From source, do a DFS search along all paths where the minimal distance is increased one by one until the target node is reached or the shortest length is reached. Print the path whenever the target node is reached.
A transformation sequence from word beginWord to word endWord using a dictionary wordList is a sequence of words beginWord -> s1 -> s2 -> ... -> sk such that:
Every adjacent pair of words differs by a single letter.
Every si for 1 <= i <= k is in wordList. Note that beginWord does not need to be in wordList.
sk == endWord
Given two words, beginWord and endWord, and a dictionary wordList, return all the shortest transformation sequences from beginWord to endWord, or an empty list if no such sequence exists. Each sequence should be returned as a list of the words [beginWord, s1, s2, ..., sk].
Example 1:
Input: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"]
Output: [["hit","hot","dot","dog","cog"],["hit","hot","lot","log","cog"]]
Explanation: There are 2 shortest transformation sequences:
"hit" -> "hot" -> "dot" -> "dog" -> "cog"
"hit" -> "hot" -> "lot" -> "log" -> "cog"
Example 2:
Input: beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log"]
Output: []
Explanation: The endWord "cog" is not in wordList, therefore there is no valid transformation sequence.
https://leetcode.com/problems/word-ladder-ii
class Solution {
public List<List<String>> findLadders(String beginWord, String endWord, List<String> wordList) {
List<List<String>> result = new ArrayList<>();
if (wordList == null) {
return result;
}
Set<String> dicts = new HashSet<>(wordList);
if (!dicts.contains(endWord)) {
return result;
}
Set<String> start = new HashSet<>();
Set<String> end = new HashSet<>();
Map<String, List<String>> map = new HashMap<>();
start.add(beginWord);
end.add(endWord);
bfs(map, start, end, dicts, false);
List<String> subList = new ArrayList<>();
subList.add(beginWord);
dfs(map, result, subList, beginWord, endWord);
return result;
}
private void bfs(Map<String, List<String>> map, Set<String> start, Set<String> end, Set<String> dicts, boolean reverse) {
// Processed all the word in start
if (start.size() == 0) {
return;
}
dicts.removeAll(start);
Set<String> tmp = new HashSet<>();
boolean finish = false;
for (String str : start) {
char[] chars = str.toCharArray();
for (int i = 0; i < chars.length; i++) {
char old = chars[i];
for (char n = 'a' ; n <='z'; n++) {
if(old == n) {
continue;
}
chars[i] = n;
String candidate = new String(chars);
if (!dicts.contains(candidate)) {
continue;
}
if (end.contains(candidate)) {
finish = true;
} else {
tmp.add(candidate);
}
String key = reverse ? candidate : str;
String value = reverse ? str : candidate;
if (! map.containsKey(key)) {
map.put(key, new ArrayList<>());
}
map.get(key).add(value);
}
// restore after processing
chars[i] = old;
}
}
if (!finish) {
// Switch the start and end if size from start is bigger;
if (tmp.size() > end.size()) {
bfs(map, end, tmp, dicts, !reverse);
} else {
bfs(map, tmp, end, dicts, reverse);
}
}
}
private void dfs (Map<String, List<String>> map,
List<List<String>> result , List<String> subList,
String beginWord, String endWord) {
if(beginWord.equals(endWord)) {
result.add(new ArrayList<>(subList));
return;
}
if (!map.containsKey(beginWord)) {
return;
}
for (String word : map.get(beginWord)) {
subList.add(word);
dfs(map, result, subList, word, endWord);
subList.remove(subList.size() - 1);
}
}
}