What does boost::out_edges( v, g ) in Boost.Graph do? - boost

I am not able to understand the documentation for this function. I have seen the following several times:
tie (ei,ei_end) = out_edges(*(vi+a),g);
**g**<-graph
**vi**<-beginning vertex of graph
**a**<- a node
**ei and ei_end** <- edge iterators
What does the function return, what does it do, and when should I use it?
Can I find all edges from a node for example?

Provides iterators to iterate over the out-going edges of node u from graph g, e.g.:
typename graph_traits < Graph >::out_edge_iterator ei, ei_end;
for (boost::tie(ei, ei_end) = out_edges(u, g); ei != ei_end; ++ei) {
    auto source = boost::source ( *ei, g );
    auto target = boost::target ( *ei, g );
    std::cout << "There is an edge from " << source << " to " << target << std::endl;
}
where Graph is the type of your graph and g is an instance of it. out_edges works for both directed and undirected graphs; for a directed graph it yields only the edges leaving u. Its counterpart is in_edges, which provides iterators over the incoming edges of a node; for a directed adjacency_list this requires the bidirectionalS selector.
In an undirected graph both out_edges and in_edges return all the edges incident to the node in question.
More information can be found at http://www.boost.org/doc/libs/1_55_0/libs/graph/doc/graph_concepts.html or in the Boost.Graph examples and tests.

As explained above, for a directed graph, out_edges accepts a "vertex_descriptor and the graph(adjacency list) to be examined" and returns "all the edges that emanate (directed from) the given vertex_descriptor", by means of an iterator-range.
As described in https://www.boost.org/doc/libs/1_69_0/libs/graph/doc/adjacency_list.html
std::pair<out_edge_iterator, out_edge_iterator>
out_edges(vertex_descriptor u, const adjacency_list& g)
Returns an iterator-range providing access to the out-edges of vertex
u in graph g. If the graph is undirected, this iterator-range provides
access to all edges incident on vertex u. For both directed and
undirected graphs, for an out-edge e, source(e, g) == u and target(e,
g) == v where v is a vertex adjacent to u.
In short, to answer some of your questions,
Yes, you can use it to find all edges from a node.
For undirected graphs, the behavior is as explained in the link above, it returns all the edges incident on the vertex (all edges connected to it)

Related

Minimum collection of vertex-disjoint paths that covers a given vertex set

Problem
Given:
A directed graph G
A source vertex s in G and a target vertex t in G
A set S of vertices of G
I want to find a collection of paths from s to t that covers S.
Then I want to partition the collection of paths into subcollections of vertex-disjoint paths.
Under these constraints, the objective is to minimise the number of subcollections.
Example
For instance, [C1 = {p1,p2,p3}, C2= {p4,p5}, C3= {p6,p7}] is a solution if:
each p_i is a path from s to t
p1,p2,p3 have no vertices in common except s and t;
p4, p5 have no vertices in common except s and t;
p6,p7 have no vertices in common except s and t;
collectively, the 7 paths cover all vertices of S.
In that case, the number of subcollections is 3.
Question
What are some good algorithms or heuristics for this optimisation problem?
I already know min cost flow, and disjoint path algos, but they don't apply in my settings.
I tried min cost flow / node disjoint paths but one run only gives one collection at a time. I don't know how to adjust cost to cover the unexplored vertices.
Given:
A directed graph G
A source vertex s in G and a target vertex t in G
A set S of vertices of G
I want to find a collection of paths from s to t that covers S.
Use Dijkstra's algorithm to find a path from s to every vertex in S and from every point in S to t.
Connect the paths to and from each S vertex into one path from s to t via a point in S.
You now have a collection of paths that, together, cover S. Let's call it CS.
Then I want to partition the collection of paths into subcollections
of vertex-disjoint paths.
Note that if s, the source vertex, has an out degree of sOD, there can be no more than sOD paths in each vertex disjoint collection.
Construct vVDP, an empty vector of vertex disjoint path collections
LOOP P over paths in CS
    SET found FALSE
    LOOP cs over collections in vVDP
        IF P is vertex disjoint with every path in cs
            add P to cs
            SET found TRUE
            BREAK out of LOOP cs
    IF found == false
        ADD collection containing P to vVDP
Here is a C++ implementation of this algorithm
void cProblem::collect()
{
    // loop over paths
    for (auto &P : vpath)
    {
        // loop over collections
        bool found = false;
        for (auto &cs : vVDP)
        {
            // loop over paths in collection
            bool disjoint = true;
            for (auto &csPath : cs)
            {
                // check that P is vertex disjoint with path in collection
                disjoint = true;
                for (auto vc : csPath)
                {
                    for (auto &vp : P)
                    {
                        if (vp == vc) {
                            disjoint = false;
                            break;
                        }
                    }
                }
                if( ! disjoint )
                    break;
            }
            if (disjoint)
            {
                // P is vertex disjoint from every path in collection
                // add P to the collection
                cs.push_back(P);
                found = true;
                break;
            }
        }
        if (!found)
        {
            // P was NOT vertex disjoint with the paths in any collection
            // start a new collection with P
            std::vector<std::vector<int>> collection;
            collection.push_back(P);
            vVDP.push_back(collection);
        }
    }
}
The complete application is at https://github.com/JamesBremner/so75419067
Detailed documentation of the required input file format is at
https://github.com/JamesBremner/so75419067/wiki
If you post a real example in the correct format, I will run the algorithm on it for you.

Graph Traversal using DFS

I am learning graph traversal from The Algorithm Design Manual by Steven S. Skiena. In his book, he has provided the code for traversing the graph using dfs. Below is the code.
dfs(graph *g, int v)
{
    edgenode *p;            /* temporary pointer */
    int y;                  /* successor vertex */

    if (finished) return;   /* allow for search termination */

    discovered[v] = TRUE;
    time = time + 1;
    entry_time[v] = time;

    process_vertex_early(v);

    p = g->edges[v];
    while (p != NULL) {
        y = p->y;
        if (discovered[y] == FALSE) {
            parent[y] = v;
            process_edge(v,y);
            dfs(g,y);
        }
        else if ((!processed[y]) || (g->directed))
            process_edge(v,y);

        if (finished) return;

        p = p->next;
    }

    process_vertex_late(v);

    time = time + 1;
    exit_time[v] = time;

    processed[v] = TRUE;
}
In an undirected graph, it looks like the code above processes each edge twice: it calls process_edge(v,y) once while traversing vertex v and again while processing vertex y. So I added the condition parent[v]!=y to else if ((!processed[y]) || (g->directed)), and now each edge is processed only once. However, I am not sure how to modify this code to work with parallel edges and self-loops. The code should process parallel edges and self-loops.
Short Answer:
Substitute your (parent[v]!=y) for (!processed[y]) instead of adding it to the condition.
Detailed Answer:
In my opinion there is a mistake in the implementation written in the book, which you discovered and fixed (except for parallel edges; more on that below). The implementation is supposed to be correct for both directed and undirected graphs, with the distinction between them recorded in the g->directed boolean property.
In the book, just before the implementation the author writes:
The other important property of a depth-first search is that it partitions the
edges of an undirected graph into exactly two classes: tree edges and back edges. The
tree edges discover new vertices, and are those encoded in the parent relation. Back
edges are those whose other endpoint is an ancestor of the vertex being expanded,
so they point back into the tree.
So the condition (!processed[y]) is supposed to handle undirected graphs (just as the condition (g->directed) handles directed graphs) by allowing the algorithm to process the edges that are back edges and preventing it from re-processing those that are tree edges (in the opposite direction). As you noticed, though, with this condition the tree edges are treated as back edges when read through the child, so you should just replace this condition with your suggested (parent[v]!=y).
The condition (!processed[y]) will ALWAYS be true for an undirected graph when the algorithm reads it as long as there are no parallel edges (further details why this is true - *). If there are parallel edges - those parallel edges that are read after the first "copy" of them will yield false and the edge will not be processed, when it should be. Your suggested condition, however, will distinguish between tree-edges and the rest (back-edges, parallel edges and self-loops) and allow the algorithm to process only those that are not tree-edges in the opposite direction.
As for self-loops, they should be fine with both the new and the old condition: they are edges with y==v. When the algorithm reaches one, y is discovered (because v is discovered before going through its edges), not processed (v is marked processed only in the last line, after going through its edges), and it is not v's parent (v is not its own parent).
*Going through v's edges, the algorithm reads this condition for y that has been discovered (so it doesn't go into the first conditional block). As quoted above (in the book there is a semi-proof for that as well which I will include at the end of this footnote), p is either a tree-edge or a back-edge. As y is discovered, it cannot be a tree-edge from v to y. It can be a back edge to an ancestor which means the call is in a recursion call that started processing this ancestor at some point, and so the ancestor's call has yet to reach the final line, marking it as processed (so it is still marked as not processed) and it can be a tree-edge from y to v, in which case the same situation holds - and y is still marked as not processed.
The semi-proof for every edge being a tree-edge or a back-edge:
Why can’t an edge go to a brother or cousin node instead of an ancestor?
All nodes reachable from a given vertex v are expanded before we finish with the
traversal from v, so such topologies are impossible for undirected graphs.
You are correct.
Quoting the book's (2nd edition) errata:
(*) Page 171, line -2 -- The dfs code has a bug, where each tree edge
is processed twice in undirected graphs. The test needs to be
strengthed to be:
else if (((!processed[y]) && (parent[v]!=y)) || (g->directed))
As for cycles - see here
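The effect of the errata's strengthened test can be seen by counting process_edge calls. The following is a minimal sketch, not the book's code: DfsEdgeLog and its skipParentEdge flag are names invented here, and the graph is a plain adjacency list of vertex indices rather than the book's edgenode lists.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the book's dfs(): records each (v, y) pair
// that process_edge would have received.
struct DfsEdgeLog {
    std::vector<std::vector<int>> adj;   // adj[v] = successors of v
    bool directed;
    bool skipParentEdge;                 // true = errata test, false = book test
    std::vector<bool> discovered, processed;
    std::vector<int> parent;
    std::vector<std::pair<int, int>> edges;

    DfsEdgeLog(std::vector<std::vector<int>> a, bool isDirected, bool useErrata)
        : adj(std::move(a)), directed(isDirected), skipParentEdge(useErrata),
          discovered(adj.size(), false), processed(adj.size(), false),
          parent(adj.size(), -1) {}

    void run(int v) {
        assert(v >= 0 && v < static_cast<int>(adj.size()));
        discovered[v] = true;
        for (int y : adj[v]) {
            if (!discovered[y]) {
                parent[y] = v;
                edges.emplace_back(v, y);          // tree edge
                run(y);
            } else {
                bool notTreeEdgeBack = !skipParentEdge || parent[v] != y;
                if ((!processed[y] && notTreeEdgeBack) || directed)
                    edges.emplace_back(v, y);      // non-tree edge
            }
        }
        processed[v] = true;
    }
};
```

On an undirected triangle (0-1, 1-2, 2-0) started from vertex 0, the book's test logs five edge visits while the errata's test logs three, one per edge. Note that under the errata's test a self-loop is still logged, but a parallel edge back to the parent is skipped, because the test only looks at the endpoint y and cannot tell the two copies apart.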

sort graph by distance to end nodes

I have a list of nodes which belong in a graph. The graph is directed and does not contain cycles. Also, some of the nodes are marked as "end" nodes. Every node has a set of input nodes I can use.
The question is the following: how can I sort the nodes in the list (ascending) by the biggest distance to any reachable end node? Here is an example of what the graph could look like.
I have already added the calculated distance after which I can sort the nodes (grey). The end nodes have the distance 0 while C, D and G have the distance 1. However, F has the distance of 3 because the approach over D would be shorter (2).
I have made a concept of which I think, the problem would be solved. Here is some pseudo-code:
sortedTable<Node, depth> // used to store nodes and their currently calculated distance
tempTable<Node>// used to store nodes
currentDepth = 0;
- fill tempTable with end nodes
while( tempTable is not empty)
{
    - create empty newTempTable<Node node>
    // add tempTable to sortedTable
    for (every "node" in tempTable)
    {
        if("node" is in sortedTable)
        {
            - overwrite depth in sortedTable with currentDepth
        }
        else
        {
            - add (node, currentDepth) to sortedTable
        }
        // get the node in the next layer
        for ( every "newNode" connected to node)
        {
            - add newNode to newTempTable
        }
        - tempTable = newTempTable
    }
    currentDepth++;
}
This approach should work. However, the problem with this algorithm is that it basically creates a tree from the graph starting at every end node and then corrects old distance calculations at every depth. For example: G would have the depth 1 (calculated directly over B), then the depth 3 (calculated over A, D and F) and then depth 4 (calculated over A, C, E and F).
Do you have a better solution to this problem?
It can be done with dynamic programming.
The graph is a DAG, so first do a topological sort on the graph, let the sorted order be v1,v2,v3,...,vn.
Now, set D(v)=0 for every end node and then, going from last to first in the topological order, compute:
D(v) = max { D(u) + 1, for each edge (v,u) }
This works because the graph is a DAG: when the nodes are handled in reverse topological order, the values D(u) for all outgoing edges (v,u) are already known.
Example on your graph:
Topological sort (one possible):
H,G,B,F,D,E,C,A
Then, the algorithm:
init:
D(B)=D(A)=0
Go back from last to first:
D(A) - no out edges, done
D(C) = max{D(A) + 1} = max{0+1}=1
D(E) = max{D(C) + 1} = 2
D(D) = max{D(A) + 1} = 1
D(F) = max{D(E)+1, D(D)+1} = max{2+1,1+1} = 3
D(B) = 0
D(G) = max{D(B)+1,D(F)+1} = max{1,4}=4
D(H) = max{D(G) + 1} = 5
As a side note, if the graph is not a DAG but a general graph, this is a variant of the Longest Path Problem, which is NP-hard.
Luckily, it does have an efficient solution when our graph is a DAG.
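The recurrence above can be sketched in C++ as follows. This is one possible reading, not a reference implementation: the function name longestDistanceToEnd is made up, nodes are assumed to be numbered 0..n-1, and nodes that cannot reach any end node are given -1 (a choice made here, not something the answer specifies). A DFS post-order is used because it lists every node after all of its successors, which is exactly the reverse topological order the recurrence needs.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// adj[v] lists the successors of v in a DAG; isEnd[v] marks end nodes.
// Returns, for every node, the largest number of edges on a path to any
// reachable end node, or -1 if no end node is reachable (assumption).
std::vector<int> longestDistanceToEnd(const std::vector<std::vector<int>>& adj,
                                      const std::vector<bool>& isEnd)
{
    assert(adj.size() == isEnd.size());
    const int n = static_cast<int>(adj.size());

    // DFS post-order: each node is appended after all of its successors.
    std::vector<int> order;
    std::vector<bool> seen(n, false);
    std::function<void(int)> visit = [&](int v) {
        seen[v] = true;
        for (int u : adj[v])
            if (!seen[u]) visit(u);
        order.push_back(v);
    };
    for (int v = 0; v < n; ++v)
        if (!seen[v]) visit(v);

    std::vector<int> dist(n, -1);
    for (int v : order) {
        if (isEnd[v]) dist[v] = 0;
        for (int u : adj[v])              // D(u) is already final here
            if (dist[u] >= 0) dist[v] = std::max(dist[v], dist[u] + 1);
    }
    return dist;
}
```

For the example graph, numbering A..H as 0..7 and marking A and B as end nodes, the result should match the hand computation above: 0, 0, 1, 1, 2, 3, 4, 5. Sorting the node list by these values then gives the requested order.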

Edge classification during Breadth-first search on a directed graph

I am having difficulties finding a way to properly classify the edges during a breadth-first search on a directed graph.
During a breadth-first or depth-first search, you can classify the edges met with 4 classes:
TREE
BACK
CROSS
FORWARD
Skiena [1] gives an implementation. If you move along an edge from v1 to v2, here is a way to return the class during a DFS in java, for reference. The parents map returns the parent vertex for the current search, and the timeOf() method, the time at which the vertex has been discovered.
if ( v1.equals( parents.get( v2 ) ) ) { return EdgeClass.TREE; }
if ( discovered.contains( v2 ) && !processed.contains( v2 ) ) { return EdgeClass.BACK; }
if ( processed.contains( v2 ) )
{
    if ( timeOf( v1 ) < timeOf( v2 ) )
    {
        return EdgeClass.FORWARD;
    }
    else
    {
        return EdgeClass.CROSS;
    }
}
return EdgeClass.UNCLASSIFIED;
My problem is that I cannot get it right for a Breadth first search on a directed graph. For instance:
The following graph - that is a loop - is ok:
A -> B
A -> C
B -> C
BFSing from A, B will be discovered, then C. The edges eAB and eAC are TREE edges, and when eBC is crossed last, B and C are processed and discovered, and this edge is properly classified as CROSS.
But a plain loop does not work:
A -> B
B -> C
C -> A
When the edge eCA is crossed last, A is processed and discovered. So this edge is incorrectly labeled as CROSS, whereas it should be a BACK edge.
There is indeed no difference in the way the two cases are treated, even if the two edges have different classes.
How do you implement a proper edge classification for a BFS on a directed graph?
[1] http://www.algorist.com/
EDIT
Here is an implementation derived from @redtuna's answer.
I just added a check not to fetch the parent of the root.
I have JUnit tests that show it works for directed and undirected graphs, in the case of a loop, a straight line, a fork, a standard example, a single edge, etc.
@Override
public EdgeClass edgeClass( final V from, final V to )
{
    if ( !discovered.contains( to ) ) { return EdgeClass.TREE; }
    int toDepth = depths.get( to );
    int fromDepth = depths.get( from );
    V b = to;
    while ( toDepth > 0 && fromDepth < toDepth )
    {
        b = parents.get( b );
        toDepth = depths.get( b );
    }
    V a = from;
    while ( fromDepth > 0 && toDepth < fromDepth )
    {
        a = parents.get( a );
        fromDepth = depths.get( a );
    }
    if ( a.equals( b ) )
    {
        return EdgeClass.BACK;
    }
    else
    {
        return EdgeClass.CROSS;
    }
}
How do you implement a proper edge classification for a BFS on a
directed graph?
As you already established, seeing a node for the first time creates a tree edge. The problem with BFS instead of DFS, as David Eisenstat said before me, is that back edges cannot be distinguished from cross ones just based on traversal order.
Instead, you need to do a bit of extra work to distinguish them. The key, as you'll see, is to use the definition of a cross edge.
The simplest (but memory-intensive) way is to associate every node with the set of its predecessors. This can be done trivially when you visit nodes. When finding a non-tree edge between nodes a and b, consider their predecessor sets. If one is a proper subset of the other, then you have a back edge. Otherwise, it's a cross edge. This comes directly from the definition of a cross edge: it's an edge between nodes where neither is the ancestor nor the descendant of the other on the tree.
A better way is to associate only a "depth" number with each node instead of a set. Again, this is readily done as you visit nodes. Now when you find a non-tree edge between a and b, start from the deeper of the two nodes and follow the tree edges backwards until you go back to the same depth as the other. So for example suppose a was deeper. Then you repeatedly compute a=parent(a) until depth(a)=depth(b).
If at this point a=b then you can classify the edge as a back edge because, as per the definition, one of the nodes is an ancestor of the other on the tree. Otherwise you can classify it as a cross edge because we know that neither node is an ancestor or descendant of the other.
pseudocode:
foreach edge(a,b) in BFS order:
    if !b.known then:
        b.known = true
        b.depth = a.depth+1
        edge type is TREE
        continue to next edge
    while (b.depth > a.depth): b=parent(b)
    while (a.depth > b.depth): a=parent(a)
    if a==b then:
        edge type is BACK
    else:
        edge type is CROSS
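The pseudocode above translates fairly directly into C++. The sketch below is an interpretation under a few assumptions: vertices are indices 0..n-1, the graph is an adjacency list, a depth of -1 plays the role of "not known", and the names EdgeClass, ClassifiedEdge and classifyBfsEdges are invented for this example.

```cpp
#include <cassert>
#include <queue>
#include <vector>

enum class EdgeClass { TREE, BACK, CROSS };

struct ClassifiedEdge {
    int from;
    int to;
    EdgeClass cls;
};

// Classify every edge reached by a BFS from root, in BFS order.
std::vector<ClassifiedEdge> classifyBfsEdges(
    const std::vector<std::vector<int>>& adj, int root)
{
    const int n = static_cast<int>(adj.size());
    assert(root >= 0 && root < n);
    std::vector<int> depth(n, -1), parent(n, -1);
    std::vector<ClassifiedEdge> result;
    std::queue<int> pending;
    depth[root] = 0;
    pending.push(root);
    while (!pending.empty()) {
        int from = pending.front();
        pending.pop();
        for (int to : adj[from]) {
            if (depth[to] < 0) {                       // first sighting
                depth[to] = depth[from] + 1;
                parent[to] = from;
                pending.push(to);
                result.push_back({from, to, EdgeClass::TREE});
                continue;
            }
            // Lift the deeper endpoint until both are at the same depth.
            int a = from, b = to;
            while (depth[b] > depth[a]) b = parent[b];
            while (depth[a] > depth[b]) a = parent[a];
            result.push_back(
                {from, to, a == b ? EdgeClass::BACK : EdgeClass::CROSS});
        }
    }
    return result;
}
```

On the two graphs from the question, this should label B -> C in the first graph as CROSS and C -> A in the plain loop as BACK. The walk up the tree costs time proportional to the depth difference per edge, which is the price paid for not storing predecessor sets.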
The key property of DFS here is that, given two nodes u and v, the interval [u.discovered, u.processed] is a subinterval of [v.discovered, v.processed] if and only if u is a descendant of v. The times in BFS do not have this property; you have to do something else, e.g., compute the intervals via DFS on the tree that BFS produced. Then the classification pseudocode is:
1. check for membership in the tree (tree edge)
2. check whether the head's interval contains the tail's (back edge)
3. check whether the tail's interval contains the head's (forward edge)
4. otherwise, declare a cross edge.
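The interval idea can be sketched as a small helper that numbers the BFS tree with a DFS and then answers ancestor queries by interval containment. The struct name TreeIntervals and its members are hypothetical, and the sketch assumes the BFS tree is given as a parent array in which exactly one vertex (the root) has parent -1 and every other vertex was reached by the BFS.

```cpp
#include <cassert>
#include <vector>

// DFS entry/exit times over the tree described by parent[].
struct TreeIntervals {
    std::vector<int> entryTime, exitTime;

    explicit TreeIntervals(const std::vector<int>& parent)
        : entryTime(parent.size(), 0), exitTime(parent.size(), 0)
    {
        const int n = static_cast<int>(parent.size());
        std::vector<std::vector<int>> children(n);
        int root = -1;
        for (int v = 0; v < n; ++v) {
            if (parent[v] < 0) root = v;
            else children[parent[v]].push_back(v);
        }
        assert(root >= 0);
        int clock = 0;
        number(root, children, clock);
    }

    // True when u is an ancestor of v in the tree; a vertex counts as
    // its own ancestor here.
    bool isAncestor(int u, int v) const {
        return entryTime[u] <= entryTime[v] && exitTime[v] <= exitTime[u];
    }

private:
    void number(int v, const std::vector<std::vector<int>>& children,
                int& clock) {
        entryTime[v] = clock++;
        for (int c : children[v]) number(c, children, clock);
        exitTime[v] = clock++;
    }
};
```

With this in place, a non-tree edge whose target is an ancestor of its source (isAncestor(target, source)) would be reported as a back edge, and one where neither endpoint is an ancestor of the other as a cross edge. Each query is constant time after the linear-time numbering.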
Instead of timeOf(), you need another vertex property that contains the distance from the root. Let's name it distance.
You have to process a vertex v in the following way:
for (v0 in v.neighbours) {
    if (!v0.discovered) {
        v0.discovered = true;
        v0.parent = v;
        v0.distance = v.distance + 1;
    }
}
v.processed = true;
After you have processed a vertex v, you can run the following algorithm for every edge (from v1 to v2) of v:
if (!v1.discovered) return EdgeClass.BACK;
else if (!v2.discovered) return EdgeClass.FORWARD;
else if (v1.distance == v2.distance) return EdgeClass.CROSS;
else if (v1.distance > v2.distance) return EdgeClass.BACK;
else {
    if (v2.parent == v1) return EdgeClass.TREE;
    else return EdgeClass.FORWARD;
}

Graph data structure terms

What is the difference between the terms edge and path in graph data structure?
An edge is something that connects two nodes. A path is a series of edges in sequence that defines a "path" from node A to node B.
http://en.wikipedia.org/wiki/Graph_(data_structure)
Edge: connects one node to another, so there are no nodes between node A and node B.
eg. A<-->B or A-->B or A<---B.
Path: connects two or more nodes to each other through a sequence of edges, so a path contains one or more edges.
eg. 1.) A---B---C : here path is ABC
2.)
A
/ \
B C
/
D
Here different paths are A-B-C and A-C.
Different edges are: A-B, B-C, A-C.
I hope this clears your doubt
Edge is a connection between two vertices of the graph.
Consider the graph a b
6---4----5
| | \ e
c | d| 1
| | / f
3----2
g
a,b,c,d,e represents the edges of the graphs where as a path can be path from a to g that can be a,b,d,g or a,c,g.
A vertex is a point/dot (maybe a starting point, mid point, or ending point), and an edge is the line segment joining two such points.
A path is a line made up of a sequence of connected edges.
A graph is a 2-tuple G = (V, E), where:
V -> set of vertices (points/nodes or whatever you call it)
E -> set of edges (a line which connects any two vertices)
Such that: (v,u) belongs to E (set of edges) => v, u belongs to V (set of vertices).
Now, when we talk about paths: These are series of connected edges, which starts from a vertex and ends in another vertex.
Then you have several types of graphs : i.e. Connected/disconnected directed/undirected weighted/unweighted graphs.
Further reading : http://en.wikipedia.org/wiki/Graph_(mathematics)
Hope it helps!!
An edge connects two nodes, and a path is a sequence of nodes in which each consecutive pair is joined by an edge.
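The distinction can also be put in code: an edge is a single pair of vertices, while a path is a vertex sequence whose consecutive pairs are all edges. The sketch below assumes directed edges and vertices named by single characters; Edge and isPath are names chosen for this illustration.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// A directed edge from first to second.
using Edge = std::pair<char, char>;

// True when every consecutive pair of vertices is an edge of the graph.
// A single vertex counts as a path with zero edges in this sketch.
bool isPath(const std::set<Edge>& edges, const std::vector<char>& vertices)
{
    if (vertices.empty()) return false;
    for (std::size_t i = 0; i + 1 < vertices.size(); ++i) {
        if (edges.count(Edge{vertices[i], vertices[i + 1]}) == 0)
            return false;
    }
    return true;
}
```

With the edges A->B, B->C and A->C, both A, B, C and A, C are paths from A to C, whereas C, A is not a path because no edge runs in that direction.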
