How do I reduce this spanning tree problem to NP-completeness? - algorithm

I have the following algorithmic problem:
Given a graph G=(V,E), does G have a spanning tree with exactly k leaves?
A leaf is a vertex with only one neighbor in the spanning tree.
Also, I'm not looking for a minimum spanning tree, just a spanning tree.
To sum up, a solution algorithm would take as inputs a graph G and a number k, and return either true or false, depending on whether G has a spanning tree with exactly k leaves.
Example:
For this graph (figure not preserved):
if k is 6, then my algorithm would output "True" because the graph has a spanning tree with 6 leaves (figure not preserved).
Now I am pretty sure that this problem is NP-complete, so I need to perform a reduction from a known NP-complete problem.
I just have no idea which problem to use or what the reduction should look like. Can you help out?

The Hamiltonian path problem is a special case of your problem - a spanning tree with exactly k = 2 leaves is a Hamiltonian path. Testing for the existence of one is NP-complete.
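To make the special case concrete, here is a brute-force sketch in Python. It is exponential and only meant for tiny graphs; the function names and the vertex numbering 0..n-1 are assumptions made for illustration, not part of the question.

```python
from itertools import combinations


def is_spanning_tree(n, edges):
    """True if the given edges form a spanning tree on vertices 0..n-1."""
    if len(edges) != n - 1:
        return False
    parent = list(range(n))  # union-find forest used to detect cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:  # the edge would close a cycle
            return False
        parent[ru] = rv
    return True  # n - 1 edges and no cycle means a spanning tree


def has_spanning_tree_with_k_leaves(n, edges, k):
    """Try every set of n - 1 edges and count degree-1 vertices."""
    for subset in combinations(edges, n - 1):
        if not is_spanning_tree(n, subset):
            continue
        degree = [0] * n
        for u, v in subset:
            degree[u] += 1
            degree[v] += 1
        if sum(1 for d in degree if d == 1) == k:
            return True
    return False
```

With k = 2 the function answers the Hamiltonian path question: a 4-cycle has such a tree, while a star with three leaves does not. The reduction itself simply maps a Hamiltonian path instance G to the instance (G, 2).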

Not a real answer to your question, but you might want to try to simplify the graph before you embark on those 1.x^N algorithms.
Simplifying things (untested code ahead):
if (nodes.size() < K)
    return false;
Remove all nodes with only one edge as they are forced to be leaves.
while (!nodes.empty() && nodes.front().edges.size() == 1) {
    nodes.erase(nodes.begin()); // updates one other node, which could then have 1 edge
    K--;
}
if (K < 0 || nodes.size() < K)
    return false;
Remove every node that has exactly 2 edges and whose removal would disconnect the graph, and connect its two neighbors directly. A node is not such a bridge if there is any other path between its two neighbors. O(N^2)
node = nodes.begin();
while (node->edges.size() == 2) {
    if (DisconnectingBridge(node)) {
        edges = node->edges;
        node = nodes.erase(node); // returns next node
        nodes.addEdge(edges.front(), edges.back()); // connect the two parts
    } else {
        node++; // next node
    }
}

Related

Using graph traversal to test if a graph is a perfect binary tree

When given an undirected graph G represented by an adjacency list, how can you use a DFS to see if that graph is a perfect binary tree?
I have been able to identify edge cases: for example, since a perfect tree of depth D needs 2^D - 1 nodes, you can work out the depth using a logarithm, and if that isn't a whole number you know you don't have a perfect tree. But I can't think of an efficient way of using the adjacency list and DFS to test the rest.
In a perfect binary tree that is not empty, with 𝑛 nodes, we have these properties:
The number of nodes 𝑛 is one less than a power of 2, i.e. ℎ = log2(𝑛+1) is an integer and 𝑛 = 2^ℎ − 1.
The number of edges is 𝑛−1
There are no nodes with more than 3 neighbors.
When 𝑛 > 1, there is (only) one node with exactly 2 neighbors: it is the root.
When 𝑛 > 1, the leaves of the tree have only one neighbor: there are 2^(ℎ−1) of them.
The distance between the root and any leaf is ℎ−1.
These properties can be checked one after the other. Once you have identified the root, you can perform a traversal, with either DFS or BFS, to check the distance property.
If the graph is empty, or has only one vertex, then return true.
Otherwise, check to make sure the graph is connected and acyclic.
Then, if it's a perfect binary tree, there must be only one vertex of degree 2. That's the root. Let a and b be its two children. Then:
    let depthA = depthIfPerfect(a, root);
    let depthB = depthIfPerfect(b, root);
    return depthA == depthB && depthA >= 0;
where:
depthIfPerfect(node, parent):
    if degree(node) == 1:
        return 1;
    if degree(node) != 3:
        return -1; // not perfect
    let a and b be the neighbors that aren't parent
    let depthA = depthIfPerfect(a, node);
    let depthB = depthIfPerfect(b, node);
    if (depthA != depthB || depthA < 0):
        return -1; // not perfect
    return depthA + 1;
You can mix the check for connectedness and acyclicity into this traversal if you like.
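As a rough illustration of how the checks above can be combined into one traversal, here is a Python sketch. It assumes the graph is a simple undirected graph given as a dict from vertex to neighbor list; the function name is made up for this example.

```python
def is_perfect_binary_tree(adj):
    """adj: dict mapping each vertex to a list of its neighbors."""
    n = len(adj)
    if n <= 1:
        return True
    # A tree on n vertices has exactly n - 1 edges (each counted twice here).
    if sum(len(nbrs) for nbrs in adj.values()) != 2 * (n - 1):
        return False
    # The root is the only vertex with exactly two neighbors.
    roots = [v for v, nbrs in adj.items() if len(nbrs) == 2]
    if len(roots) != 1:
        return False
    root = roots[0]

    seen = {root}
    stack = [(root, 0)]
    leaf_depths = set()
    while stack:  # iterative DFS from the root
        v, depth = stack.pop()
        degree = len(adj[v])
        if v != root and degree not in (1, 3):
            return False
        if degree == 1:
            leaf_depths.add(depth)
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append((w, depth + 1))
    # Connected (all vertices reached) and every leaf at the same depth.
    return len(seen) == n and len(leaf_depths) == 1
```

The edge count together with the "all vertices reached" test covers connectedness and acyclicity, so no separate cycle check is needed in this sketch.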

Linear-time algorithm for number of distinct paths from each vertex in a directed acyclic graph

I am working on the following past paper question for an algorithms module:
Let G = (V, E) be a simple directed acyclic graph (DAG).
For a pair of vertices v, u in V, we say v is reachable from u if there is a (directed) path from u to v in G.
(We assume that every vertex is reachable from itself.)
For any vertex v in V, let R(v) be the reachability number of vertex v, which is the number of vertices u in V that are reachable from v.
Design an algorithm which, for a given DAG, G = (V, E), computes the values of R(v) for all vertices v in V.
Provide the analysis of your algorithm (i.e., correctness and running time analysis).
(Optimally, one should try to design an algorithm running in O(n + m) time.)
So far, I have the following thoughts:
The following algorithm for finding a topological sort of a DAG might be useful:
TopologicalSort(G)
    1. Run DFS on G and compute a DFS-numbering, N // A DFS-numbering is a numbering (starting from 1) of the vertices of G, representing the point at which the DFS-call on a given vertex v finishes.
    2. Let the topological sort be the function a(v) = n - N[v] + 1 // n is the number of nodes in G and N[v] is the DFS-number of v.
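For reference, the TopologicalSort routine above could be sketched in Python roughly as follows. The dict-of-successor-lists representation and the function name are assumptions for illustration; every vertex is expected to appear as a key.

```python
def topological_positions(succ):
    """succ: dict mapping each vertex of a DAG to a list of its successors.

    Returns a(v) = n - N[v] + 1, where N[v] is the DFS finishing number.
    """
    visited = set()
    finish = {}
    counter = 0

    def dfs(v):
        nonlocal counter
        visited.add(v)
        for w in succ[v]:
            if w not in visited:
                dfs(w)
        counter += 1  # v finishes only after all of its successors
        finish[v] = counter

    for v in succ:
        if v not in visited:
            dfs(v)

    n = len(succ)
    return {v: n - finish[v] + 1 for v in succ}
```

Because a vertex finishes after everything reachable from it, every edge u -> v ends up with a(u) < a(v).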
My second thought is that dynamic programming might be a useful approach, too.
However, I am currently not sure how to combine these two ideas into a solution.
I would appreciate any hints!
EDIT: Unfortunately the approach below is not correct in general. It may count multiple times the nodes that can be reached via multiple paths.
The ideas below are valid if the DAG is a polytree, since this guarantees that there is at most one path between any two nodes.
You can use the following steps:
1. Find all nodes with 0 in-degree (i.e. no incoming edges). This can be done in O(n + m), e.g. by looping through all edges and marking those nodes that are the end of any edge. The nodes with 0 in-degree are those which have not been marked.
2. Start a DFS from each node with 0 in-degree. After the DFS call for a node ends, we want to have computed for that node the information of its reachability. In order to achieve this, we need to add the reachability of the successors of this node. Some of these values might have already been computed (if the successor was already visited by DFS), therefore this is a dynamic programming solution.
The following pseudocode describes the DFS code:
function DFS(node) {
    visited[node] = true;
    reachability[node] = 1;
    for each successor of node {
        if (!visited[successor]) {
            DFS(successor);
        }
        reachability[node] += reachability[successor];
    }
}
After calling this for all nodes with 0 in-degree, the reachability array will contain the reachability for all nodes in the graph.
The overall complexity is O(n + m).
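The overcounting mentioned in the EDIT note is easy to see on a small "diamond" DAG. The Python sketch below puts the summing recurrence from the pseudocode next to an exact variant that stores reachable sets as integer bitmasks. The bitmask variant is a swapped-in technique, not part of this answer, and it is not linear time; all names are made up for illustration.

```python
def reachability_by_summing(succ):
    """The DFS recurrence from the pseudocode; exact only on polytrees."""
    reach = {}

    def dfs(v):
        reach[v] = 1  # v reaches itself
        for w in succ[v]:
            if w not in reach:
                dfs(w)
            reach[v] += reach[w]

    for v in succ:  # starting from every unvisited vertex also covers the sources
        if v not in reach:
            dfs(v)
    return reach


def reachability_exact(succ):
    """Exact counts using one bitmask of reachable vertices per vertex."""
    index = {v: i for i, v in enumerate(succ)}
    mask = {}

    def dfs(v):
        m = 1 << index[v]
        for w in succ[v]:
            if w not in mask:
                dfs(w)
            m |= mask[w]  # union of sets, so shared descendants count once
        mask[v] = m

    for v in succ:
        if v not in mask:
            dfs(v)
    return {v: bin(m).count("1") for v, m in mask.items()}
```

On the diamond A -> B, A -> C, B -> D, C -> D, the summing version reports 5 for A because D is counted once through B and once through C, while the exact version reports 4.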
I'd suggest using a Breadth First Search approach.
For every node, add all the nodes that are connected to the queue. In addition to that, maintain a separate array for calculating the reachability.
For example, if there is an edge A->B, then
1.) Mark A as traversed
2.) B is added to the queue
3.) arr[B]+=1
This way, we can get R(v) for all vertices in O(|V| + |E|) time through arr[].

Minimum Spanning tree different from another

Assume we are given
an undirected graph g where every node i, 1 <= i < n, is connected to all j, i < j <= n,
and a source s.
We want to find the total costs (defined as the sum of all edges' weights) of the cheapest minimum spanning tree that differs from the minimum distance tree of s (i.e. from the MST obtained by running prim/dijkstra on s) by at least one edge.
What would be the best way to tackle this? Because currently, I can only think of some kind of fixed-point iteration
1. run Dijkstra on (g,s) to obtain the reference graph r that we need to differ from
2. costs := sum(edge_weights_of(r))
3. change := 0
4. for each vertex u in r, run a BFS and note for each reached vertex v the longest edge on the path from u to v.
5. iterate through all edges e = (a,b) in g and find the edge e' = (a',b') that is NOT in r and minimizes newchange := weight(e') - weight(longest_edge(a',b'))
6. if (first_time_here OR newchange < 0) then change += newchange
7. if (newchange < 0) goto 4
8. result := costs + change
That seems to waste a lot of time... It relies on the fact that adding an edge to a spanning tree creates a cycle from which we can remove the longest edge.
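The edge-exchange fact used here can be sketched in Python as follows: for every edge outside the tree, find the heaviest tree edge on the cycle it would close and record the cost difference of swapping the two. The function name and the (u, v, weight) edge format are assumptions for illustration, and this only performs a single exchange, not the full iteration described above.

```python
from collections import defaultdict


def best_single_swap(tree_edges, other_edges):
    """tree_edges: edges (u, v, weight) of a spanning tree.
    other_edges: edges (u, v, weight) of the graph that are not in the tree.
    Returns the smallest cost change achievable by one edge exchange,
    or None if there is no edge outside the tree."""
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def heaviest_from(src):
        # Heaviest edge weight on the tree path from src to every vertex.
        heaviest = {src: float("-inf")}
        stack = [src]
        while stack:
            x = stack.pop()
            for y, w in adj[x]:
                if y not in heaviest:
                    heaviest[y] = max(heaviest[x], w)
                    stack.append(y)
        return heaviest

    cache = {}  # one traversal per distinct start vertex
    best = None
    for a, b, w in other_edges:
        if a not in cache:
            cache[a] = heaviest_from(a)
        change = w - cache[a][b]  # add (a, b), drop the heaviest cycle edge
        if best is None or change < best:
            best = change
    return best
```

Each traversal is linear in the tree size, so the sketch runs in O(n^2 + m) in the worst case. A result of 0 means a different spanning tree with the same total cost exists.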
I also thought about using Kruskal to get an overall minimum spanning tree, and only using the above algorithm to replace a single edge when the trees from both Prim and Kruskal happen to be the same. But that doesn't seem to work, as the result would be highly dependent on the edges selected during a run of Kruskal.
Any suggestions/hints?
You can do it using Prim's algorithm.
Prim's algorithm:
let T be a single vertex x
while (T has fewer than n vertices)
{
    1. find the smallest edge connecting T to G-T
    2. add it to T
}
Now let's modify it.
Suppose you already have one minimum spanning tree, say Tree(E,V).
Then use this algorithm:
Prim's algorithm (Modified):
let T be a single vertex
let isOther = false
while (T has fewer than n vertices)
{
    1. find the smallest edge (say e) connecting T to G-T
    2. if more than one such edge is found {
        check which one you have in E(Tree)
        choose one different from this
        add it to T
        set isOther = true
    }
    else if only one edge is found {
        add it to T
        if E(Tree) doesn't contain this edge, set isOther = true
        else don't touch isOther (keep its value)
    }
}
If isOther = true, you have found another tree, T, that is different from Tree(E,V).
Otherwise the graph has a single minimum spanning tree.

Weighted, Undirected Adjacency List: Maximum Weighted Edge in a Single Cycle in O(n)

So this problem involves an adjacency-list-based graph G. This graph has exactly n edges and n vertices. It also has one, and only one, cycle. What is the fastest possible algorithm (in big-O terms) to find the edge with the maximum weight in the cycle?
I'm pretty sure this can be done in O(n), but I'm struggling to figure out the specifics, considering that you must verify that your result is in a cycle. The original way I thought through this problem was a simple depth first search, which you could use to find the maximum weighted edge in the entire graph in O(n) time (since V+E = 2n). You could then do another search to verify whether or not this edge was in the cycle. If it is, then you have your answer in O(n), but if it is not it will take O(n^2) time. This is definitely not ideal though and I'm looking for an O(n) solution.
You can return in the DFS which node was found in the cycle, then go back marking every node up in the DFS tree as part of the cycle (until the found node itself). Something like this:
DFS(v, parent):
    mark v as visited
    for edges (v, w) in E:
        if w is parent:
            continue  // in an undirected graph, don't walk straight back along the tree edge
        if w is not visited:
            last_node = DFS(w, v)
            if last_node != -1:
                test (v, w) as maximum edge
                if last_node != v:
                    return last_node
                else:
                    return -1
        else:
            test (v, w) as maximum edge
            return w
    return -1
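A runnable Python sketch of this DFS might look like the following. It assumes a simple, connected, undirected graph on vertices 0..n-1 with exactly one cycle, given as a list of (u, v, weight) edges, and it skips the edge it arrived by so that the tree edge back to the parent is not mistaken for a cycle. The function and helper names are made up for illustration.

```python
import sys


def max_cycle_edge(n, edges):
    """Return the heaviest (u, v, weight) edge on the unique cycle."""
    adj = [[] for _ in range(n)]
    for eid, (u, v, _w) in enumerate(edges):
        adj[u].append((v, eid))
        adj[v].append((u, eid))

    visited = [False] * n
    best = None

    def consider(eid):
        nonlocal best
        if best is None or edges[eid][2] > best[2]:
            best = edges[eid]

    def dfs(v, parent_eid):
        visited[v] = True
        for w, eid in adj[v]:
            if eid == parent_eid:
                continue  # do not walk back along the edge we came from
            if visited[w]:
                consider(eid)  # this edge closes the cycle at w
                return w
            last = dfs(w, eid)
            if last != -1:
                consider(eid)  # (v, w) lies on the cycle
                return last if last != v else -1
        return -1

    sys.setrecursionlimit(max(sys.getrecursionlimit(), 2 * n + 100))
    dfs(0, -1)
    return best
```

Every vertex and edge is handled at most a constant number of times, so the running time is O(n) for n vertices and n edges. Edges hanging off the cycle are never tested, even if they are heavier than every cycle edge.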

Count paths with Topological Sort

I have a DAG and I need to count all the paths from any node to any other node. I've researched a little and found that it can be done with a topological order, but the solutions I've found so far are incomplete or wrong.
So what is the correct way to do it?
Thanks.
As this is a DAG, you can topologically sort the nodes in O(V+E) time. Let's assume the source vertex is S. Then, starting from S, process the nodes in topological order. When we're processing node U and there's an edge U->V, V has not been processed yet (why? because the graph is acyclic, so U comes before V in the order). So you can reach V from S via node U in d[U] ways, where d[U] is the number of paths from S to U.
So the number of paths from S to any node V is d[V] = d[x1] + d[x2] + d[x3] + ... + d[xy], where the edges into V are x1->V, x2->V, ..., xy->V.
This algorithm takes O(V+E) to topologically sort the graph and then at most O(V*E) to calculate the number of paths. You can reduce the path-counting step to O(V+E) by using an adjacency list instead of an adjacency matrix, and this is the most efficient solution so far.
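A minimal Python sketch of this counting scheme, assuming the DAG is given as a dict from each vertex to its list of successors (with every vertex present as a key). The topological order is computed with Kahn's algorithm here, which is one choice among several; the function name is illustrative.

```python
from collections import deque


def count_paths_from(succ, source):
    """Return d, where d[v] is the number of paths from source to v."""
    indegree = {v: 0 for v in succ}
    for v in succ:
        for w in succ[v]:
            indegree[w] += 1

    # Kahn's algorithm: repeatedly remove vertices with no remaining in-edges.
    queue = deque(v for v in succ if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in succ[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)

    d = {v: 0 for v in succ}
    d[source] = 1  # the empty path from the source to itself
    for u in order:  # d[u] is final before it is pushed along u's edges
        for v in succ[u]:
            d[v] += d[u]
    return d
```

Both phases touch each vertex and edge once, which gives the O(V+E) bound for a single source.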
You can use recursion to count all of the paths in a tree/DAG. Here is the pseudocode:
function numPaths(node1, node2):
    // base case, one path from node to itself
    if (node1 == node2): return 1
    totalPaths = 0
    for edge in node1.edges:
        nextNode = edge.destinationNode
        totalPaths += numPaths(nextNode, node2)
    return totalPaths
Edit:
A good dynamic approach to this problem is the Floyd-Warshall algorithm.
Assume G(V,E)
Let d[i][j] = the number of all the paths from i to j
Then d[i][j] = the sum of d[next][j] over all (i,next) in E
Does it seem too slow? Okay, just memoize it (some people call it dynamic programming), like this:
memset(d, -1, sizeof(d)); // set all elements of array d to -1 at the very beginning
int saya(int i, int j)
{
    if (d[i][j] != -1) return d[i][j]; // d[i][j] has already been calculated
    if (i == j) return d[i][j] = 1;    // trivial case
    d[i][j] = 0;
    for e in i.edges                   // pseudocode: every edge leaving i
        d[i][j] += saya(e.next, j);
    return d[i][j];
}
Now saya(i,j) will return the number of all the paths from i to j.
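The same memoized recurrence can be written compactly in Python with functools.lru_cache. This is only a sketch: the dict-of-successors representation and the names below are assumptions, and vertices must be hashable.

```python
from functools import lru_cache


def make_path_counter(succ):
    """succ: dict mapping each vertex of a DAG to a list of its successors."""

    @lru_cache(maxsize=None)
    def num_paths(src, dst):
        if src == dst:
            return 1  # one (empty) path from a vertex to itself
        # Each path from src to dst starts with exactly one edge src -> nxt.
        return sum(num_paths(nxt, dst) for nxt in succ[src])

    return num_paths
```

Each (src, dst) pair is computed once and then served from the cache, so repeated queries on the same DAG are cheap.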
