Finding the number of paths of given length in an undirected unweighted graph - algorithm

'Length' of a path is the number of edges in the path.
Given a source and a destination vertex, I want to find the number of paths from the source vertex to the destination vertex of a given length k.
We can visit each vertex as many times as we want, so if a path from a to b goes like this: a -> c -> b -> c -> b, it is considered valid. This means there can be cycles and we can go through the destination more than once.
Two vertices can be connected by more than one edge. So if vertex a and vertex b are connected by two edges, then the paths a -> b via edge 1 and a -> b via edge 2 are considered different.
Number of vertices N is <= 70, and K, the length of the path, is <= 10^9.
As the answer can be very large, it is to be reported modulo some number.
Here is what I have thought so far:
We can use breadth-first search without marking any vertices as visited. At each iteration, we keep track of the number of edges 'n_e' used so far on that path and the product 'p' of the number of duplicate edges each edge on our path has.
The search should stop extending a path once n_e is greater than k. If we reach the destination with n_e equal to k, we stop that branch and add p to our count of the number of paths.
I think we could use depth-first search instead of breadth-first search, as we do not need the shortest path and the queue Q used in breadth-first search might not be large enough.
The second algorithm I am thinking about is something similar to the Floyd-Warshall algorithm using this approach. Only we don't need a shortest path, so I am not sure this is correct.
The problem I have with my first algorithm is that 'K' can be up to 1000000000, which means my search will run until it has 10^9 edges, and the edge count n_e is incremented by just 1 at each level. That will be very slow, and I am not sure it will ever terminate for large inputs.
So I need a different approach to solve this problem; any help would be greatly appreciated.

So, here's a nifty graph theory trick that I remember for this one.
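Make an adjacency matrix A, where A[i][j] is 1 if there is an edge between i and j, and 0 otherwise. (Since your graph can have several edges between the same two vertices, let A[i][j] be the number of edges between i and j instead.)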
Then, the number of paths of length k between i and j is just the [i][j] entry of A^k.
So, to solve the problem, build A and construct A^k using matrix multiplication (the usual trick for doing exponentiation applies here). Then just look up the necessary entry.
EDIT: Well, you need to do the modular arithmetic inside the matrix multiplication to avoid overflow issues, but that's a much smaller detail.
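A minimal sketch of the approach described in this answer, assuming the graph is given as a square list-of-lists adjacency matrix whose entries are edge multiplicities. The function names (mat_mult, mat_pow, count_walks) are made up for illustration. It uses repeated squaring and reduces modulo `mod` inside every multiplication:

```python
def mat_mult(a, b, mod):
    """Multiply two square matrices, reducing every entry modulo mod."""
    n = len(a)
    result = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            if aik == 0:
                continue  # skip zero entries, common in sparse graphs
            row_b = b[k]
            row_r = result[i]
            for j in range(n):
                row_r[j] = (row_r[j] + aik * row_b[j]) % mod
    return result


def mat_pow(a, k, mod):
    """Compute a^k modulo mod by repeated squaring (O(n^3 log k))."""
    n = len(a)
    result = [[(1 if i == j else 0) % mod for j in range(n)] for i in range(n)]
    base = [[x % mod for x in row] for row in a]
    while k > 0:
        if k & 1:
            result = mat_mult(result, base, mod)
        base = mat_mult(base, base, mod)
        k >>= 1
    return result


def count_walks(adj, src, dst, k, mod):
    """Number of length-k walks from src to dst, modulo mod."""
    return mat_pow(adj, k, mod)[src][dst]
```

With N <= 70 and K <= 10^9 this needs about 30 squarings of a 70 x 70 matrix, so the loop count stays small even though K is huge.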

Actually, the [i][j] entry of A^k counts the distinct "walks", not "paths", of length k from i to j in a simple graph. We can easily prove this by mathematical induction on k.
However, the main question is how to count the distinct "paths" in a given graph.
There are quite a few different algorithms for this, but an upper bound on the count is as follows:
(n-2)*(n-3)*...*(n-k), where k is the given parameter stating the length of the path.

Let me add some more content to the above answers (as this is the extended problem I faced). The extended problem is:
Find the number of paths of length k in a given undirected tree.
The solution is simple: for the given adjacency matrix A of the graph G, compute A^(k-1) and A^k, and then count the number of 1s above (or below) the diagonal of the difference A^k - A^(k-1), which is what the code below does.
Let me also add the Python code.
import numpy as np

def count_paths(v, n, a):
    # v: number of vertices, n: expected path length, a: adjacency matrix
    paths = 0
    b = np.array(a, copy=True)
    # After this loop, b = a^(n-1) (for n >= 2)
    for i in range(n - 2):
        b = np.dot(b, a)
    c = np.dot(b, a)  # c = a^n
    x = c - b
    # Scan only above the diagonal so each unordered pair is counted once
    for i in range(v):
        for j in range(i + 1, v):
            if x[i][j] == 1:
                paths = paths + 1
    return paths

print(count_paths(5, 2, np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
])))

Related

How can I know all possible ways that the edges are connected if I know the toposort in a graph?

How can I know all possible ways that the edges are connected if I know the topological sort?
Here is the original problem:
Little C has topologically sorted a simple (no multiple edges) directed acyclic graph, but accidentally lost the original graph. Apart from the topological order, Little C only remembers that the original graph had k edges and that there is one vertex u in the graph that can reach all the other vertices. He wants to know how many simple directed acyclic graphs satisfy the above requirements. Since the answer may be large, you only need to output the answer modulo m.
I have just learned topological sorting. I wonder how I can use it in reverse? I know the final topological order is (1 2 3 4), that there is one vertex that can reach all other vertices, and that there are 4 edges in all, but I need the number of all possible ways the edges can be connected.
I think this problem has something to do with counting permutations, and the specific vertex u has to be the first in the topologically sorted list.
Note that m can be as large as 200'000, so you definitely cannot brute-force this problem!
Let the topological order be u = 1, 2, …, n. Since 1 can reach all other
vertices, the topological order begins with 1. Each node v > 1, being
reachable from u, must have arcs from one or more nodes < v. These
choices are linked only by the constraint on the number of arcs.
We end up computing Count[v][m] (modulo whatever the modulus is) as
the number of reconstructions on 1, 2, …, v with exactly m arcs. The
answer is Count[n][k].
Count[1][m] = 1 if m == 0 else 0
for v > 1, Count[v][m] = sum for j = 1 to min(m, v-1) of (v-1 choose j)*Count[v-1][m-j]
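The recurrence above can be sketched directly in Python. This is an illustrative implementation, not a tuned one: the function name count_reconstructions is made up, math.comb (Python 3.8+) supplies the binomial coefficients, and only the previous row of Count is kept in memory.

```python
from math import comb


def count_reconstructions(n, k, mod):
    """Count[n][k] from the recurrence above, reduced modulo mod."""
    # count[m] holds Count[v][m] for the current v; start with v = 1.
    count = [0] * (k + 1)
    count[0] = 1 % mod
    for v in range(2, n + 1):
        new = [0] * (k + 1)
        for m in range(1, k + 1):
            total = 0
            # Vertex v receives j >= 1 arcs from the v - 1 earlier vertices.
            for j in range(1, min(m, v - 1) + 1):
                total += comb(v - 1, j) * count[m - j]
            new[m] = total % mod
        count = new
    return count[k]
```

For large n and k it would probably be worth precomputing the binomial coefficients modulo the modulus with Pascal's triangle instead of calling comb in the inner loop.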

Finding two minimum spanning trees in graph such that their sum is minimal

I'm trying to solve a pretty complex problem on graphs: we are given an undirected graph with N (N <= 10) nodes and M (M <= 25) edges.
Let's say we have two sets of edges A and B. The same edge can't be in both A and B, and some edges may be used in neither set. Each edge has a value assigned to it. We want to minimize the total sum of the two trees.
Please note that in both sets A and B the edges should form a connected graph on all N nodes.
Example
N = 2, M = 3
Edges: 1 - 2, value = 10; 1 - 2, value = 20; 2 - 1, value = 30. We want to return the result 30: set A takes the first edge and set B the second edge.
N = 5
M = 8
Edges: {
(1,2,10),
(1,3,10),
(1,4,10),
(1,4,20),
(1,5,20),
(2,3,20),
(3,4,20),
(4,5,30),
}
set A contains edges {(1,2,10), (1,3,10), (1,4,10), (1,5,20)}
while set B contains {(1,4,20), (2,3,20), (3,4,20), (4,5,30)}
What I tried
First I coded a greedy solution: I generated the first minimum spanning tree and then generated the second one from the remaining edges, but it fails on some test cases. So I started thinking about this solution:
We want to split the edges into two groups, and in each group we want exactly N - 1 edges so that the graph contains no unwanted edges. So in the worst case we use (N-1) + (N-1) edges, that is, 18 edges. These are small numbers, so we can run a backtracking algorithm with some optimizations to solve this problem.
I still haven't coded the backtracking because I'm not sure it will work. Please tell me what you think. Thanks in advance.
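One possible shape for the backtracking idea, as a rough sketch only. It assumes edges are given as (u, v, value) triples with vertices numbered from 1, and the function name min_two_spanning_trees is made up. Each edge goes to set A, to set B, or is left unused. Edges that would close a cycle in a set are skipped, and branches are cut when they cannot beat the best total found so far or when too few edges remain. It has not been tuned for worst-case inputs.

```python
def min_two_spanning_trees(n, edges):
    """Minimum total value of two edge-disjoint spanning trees, or None."""
    edges = sorted(edges, key=lambda e: e[2])  # try cheap edges first
    best = None

    def find(parent, x):
        # Plain union-find lookup (no path compression; parents get copied).
        while parent[x] != x:
            x = parent[x]
        return x

    def search(i, forests, sizes, cost):
        nonlocal best
        if best is not None and cost >= best:
            return  # cannot improve on the best known solution
        if sizes[0] == n - 1 and sizes[1] == n - 1:
            best = cost  # both acyclic sets have n - 1 edges: two trees
            return
        if (n - 1 - sizes[0]) + (n - 1 - sizes[1]) > len(edges) - i:
            return  # not enough edges left to finish both trees
        u, v, w = edges[i]
        for t in (0, 1):  # put the edge into set A (0) or set B (1)
            if sizes[t] == n - 1:
                continue
            ru, rv = find(forests[t], u), find(forests[t], v)
            if ru == rv:
                continue  # would close a cycle in this set
            new_forests = list(forests)
            new_forests[t] = forests[t][:]
            new_forests[t][ru] = rv
            new_sizes = list(sizes)
            new_sizes[t] += 1
            search(i + 1, new_forests, new_sizes, cost + w)
        search(i + 1, forests, sizes, cost)  # leave the edge unused

    start = list(range(n + 1))
    search(0, [start, start[:]], [0, 0], 0)
    return best
```

On the two examples above this returns 30 and 140 respectively. It returns None when no two edge-disjoint spanning trees exist.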

An efficient solution to find whether n vertex-disjoint paths exist

You have been given an r x c grid. The vertex in row i and column j is denoted by (i,j). All vertices in the grid have exactly four neighbors except the boundary ones, which are the vertices (i,j) with i = 1, i = r, j = 1 or j = c. You are given n starting points. Determine whether there are n vertex-disjoint paths from the starting points to n boundary points.
My Solution
This can be modeled as a max-flow problem. The starting points are the sources, the boundary points are the targets, and each edge and vertex has a capacity of 1. This can be further reduced to a generic max-flow problem by splitting each vertex in two, with an edge of capacity 1 between the two halves, and adding a supersource and a supersink connected to the sources and targets, respectively, by edges of capacity 1.
After this I can simply check whether there is flow on each edge (s, si), where s is the supersource and si is the ith source, for i = 1 to n. If there is, the method returns True; otherwise it returns False.
Problem
But it seems like using max-flow for this is kind of overkill. It would take some time to preprocess the graph, and the max-flow takes about O(V E^(1/2)).
So I was wondering if there exists a more efficient solution to compute it?
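For reference, the max-flow model described under "My Solution" could be sketched as follows. This is only an illustration of that model, not a faster alternative. The function name escape_possible is made up, points are assumed to be 1-based (i, j) pairs, and plain BFS augmenting paths are used for simplicity.

```python
from collections import deque


def escape_possible(r, c, starts):
    """True if all start points reach the boundary by vertex-disjoint paths."""
    source, sink = 2 * r * c, 2 * r * c + 1
    graph = [[] for _ in range(2 * r * c + 2)]  # node -> indices of its edges
    to, cap = [], []

    def add_edge(u, v):
        # Forward edge with capacity 1 followed by its residual reverse edge.
        graph[u].append(len(to)); to.append(v); cap.append(1)
        graph[v].append(len(to)); to.append(u); cap.append(0)

    def v_in(i, j):
        return 2 * ((i - 1) * c + (j - 1))  # the matching "out" node is +1

    for i in range(1, r + 1):
        for j in range(1, c + 1):
            a = v_in(i, j)
            add_edge(a, a + 1)  # vertex capacity 1 via vertex splitting
            if i in (1, r) or j in (1, c):
                add_edge(a + 1, sink)  # boundary vertices lead to the supersink
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 1 <= ni <= r and 1 <= nj <= c:
                    add_edge(a + 1, v_in(ni, nj))
    for i, j in starts:
        add_edge(source, v_in(i, j))

    flow = 0
    while True:
        prev_edge = [-1] * len(graph)
        seen = [False] * len(graph)
        seen[source] = True
        queue = deque([source])
        while queue and not seen[sink]:
            u = queue.popleft()
            for e in graph[u]:
                if cap[e] > 0 and not seen[to[e]]:
                    seen[to[e]] = True
                    prev_edge[to[e]] = e
                    queue.append(to[e])
        if not seen[sink]:
            break
        v = sink
        while v != source:  # push one unit of flow back along the path
            e = prev_edge[v]
            cap[e] -= 1
            cap[e ^ 1] += 1
            v = to[e ^ 1]
        flow += 1
    return flow == len(starts)
```

Comparing the total flow with n is equivalent to checking that every edge (s, si) carries flow, since each of those edges has capacity 1.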

Directed graph: checking for cycle in adjacency matrix

Is there an alternative to the DFS algorithm for checking whether there are cycles in a directed graph represented by an adjacency matrix?
I found piecemeal information on the properties of matrices.
Maybe I can multiply matrix A by itself n times, and check whether any of the resulting matrices has a non-zero entry on its diagonal.
While this approach may be right, how can I extract explicitly the list of vertices representing a cycle?
And what about the complexity of this hypothetical algorithm?
Thanks in advance for your help.
Let's say after n iterations, you have a matrix where the cell at row i and column j is M[n][i][j].
By definition, M[n][i][j] = sum over k of (M[n - 1][i][k] * A[k][j]). Let's say M[13][5][5] > 0, meaning there is a cycle of length 13 starting at 5 and ending at 5. To have M[13][5][5] > 0, there must be some k such that M[12][5][k] * A[k][5] > 0. Let's say k = 6; now you know one more node in the cycle (6). It also follows that M[12][5][6] > 0 and A[6][5] > 0.
To have M[12][5][6] > 0, there must be some k such that M[11][5][k] * A[k][6] > 0. Let's say k = 9; now you know one more node in the cycle (9). It also follows that M[11][5][9] > 0 and A[9][6] > 0.
Then, you can do the same repetitively to find other nodes in the cycle.
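The backtracking just described might look like this in Python. It is a sketch under a few assumptions: adj is a square 0/1 (or count) adjacency matrix given as a list of lists, the name find_closed_walk is made up, and the powers are stored as 0/1 reachability matrices because only the test "> 0" is needed.

```python
def find_closed_walk(adj):
    """Return a closed walk [v, ..., v] found via matrix powers, or None."""
    n = len(adj)
    # powers[p][i][j] is 1 if there is a walk of exactly p arcs from i to j.
    powers = {1: [[1 if x else 0 for x in row] for row in adj]}
    for p in range(1, n + 1):
        m = powers[p]
        for i in range(n):
            if not m[i][i]:
                continue
            # Walk backwards from i, recovering one predecessor per step.
            walk = [i]
            cur = i
            for q in range(p, 1, -1):
                k = next(k for k in range(n)
                         if powers[q - 1][i][k] and adj[k][cur])
                walk.append(k)
                cur = k
            walk.append(i)
            walk.reverse()
            return walk
        # Boolean product m * adj gives the next power.
        powers[p + 1] = [[1 if any(m[i][k] and adj[k][j] for k in range(n))
                          else 0 for j in range(n)] for i in range(n)]
    return None
```

Each boolean product costs O(n^3) and up to n of them are needed, so this is O(n^4) overall and stores n matrices.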
Depth-first search can be modified to decide the existence of a cycle. The first time that the algorithm discovers an edge to a node that is still on the stack, the cycle can be extracted from the stack: it consists of the stack entries from that node up to the top. (A previously visited node that has already been popped from the stack does not indicate a cycle in a directed graph.) It would make sense to use a user-defined stack instead of the call stack. The complexity would be O(|V|+|E|), as for unmodified depth-first search itself.
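A minimal sketch of this DFS variant with an explicit stack, assuming the graph is given as adjacency lists (adj_list[u] is the list of successors of u). The name find_cycle_dfs and the three-color marking are choices made for this illustration.

```python
def find_cycle_dfs(adj_list):
    """Return a directed cycle [v, ..., v] found by DFS, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the stack, finished
    n = len(adj_list)
    color = [WHITE] * n
    for root in range(n):
        if color[root] != WHITE:
            continue
        path = [root]                      # explicit DFS stack
        iters = [iter(adj_list[root])]     # one neighbor iterator per entry
        color[root] = GRAY
        while path:
            u = path[-1]
            advanced = False
            for v in iters[-1]:            # resumes where it left off
                if color[v] == GRAY:
                    # v is still on the stack: the cycle is path[v..top] + v.
                    return path[path.index(v):] + [v]
                if color[v] == WHITE:
                    color[v] = GRAY
                    path.append(v)
                    iters.append(iter(adj_list[v]))
                    advanced = True
                    break
            if not advanced:
                color[u] = BLACK           # all neighbors done: pop u
                path.pop()
                iters.pop()
    return None
```

With an adjacency matrix instead of adjacency lists, enumerating the successors of each node costs O(|V|), so the same search takes O(|V|^2).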

Path finding algorithm on graph considering both nodes and edges

I have an undirected graph. For now, assume that the graph is complete. Each node has a certain value associated with it. All edges have a positive weight.
I want to find a path between any 2 given nodes such that the sum of the values associated with the path nodes is maximum while at the same time the path length is within a given threshold value.
The solution should be "global", meaning that the path obtained should be optimal among all possible paths. I tried a linear programming approach but am not able to formulate it correctly.
Any suggestions or a different method of solving would be of great help.
Thanks!
If you are looking for an algorithm for general graphs, your problem is NP-complete. Assume the path length threshold is n-1, each edge has weight 1, and each vertex has value 1. If you could solve your problem, you could decide whether the given graph has a Hamiltonian path: if the maximum-value path has value n, then you have a Hamiltonian path. I think you can use something like the Held-Karp relaxation to find a good solution.
This might not be perfect, but if the threshold value (T) is small enough, there's a simple algorithm that runs in O(n^3 T^2). It's a small modification of Floyd-Warshall.
d = int array with size n x n x (T + 1)
initialize all d[i][j][k] to -infty
for i in nodes:
    d[i][i][0] = value[i]
for e:(u, v) in edges:
    d[u][v][w(e)] = value[u] + value[v]
for t in 1 .. T:
    for k in nodes:
        for t' in 1..t-1:
            for i in nodes:
                for j in nodes:
                    d[i][j][t] = max(d[i][j][t],
                                     d[i][k][t'] + d[k][j][t-t'] - value[k])
The result is the pair (i, j) with the maximum d[i][j][t] for all t in 0..T
EDIT: this assumes that the paths are allowed to be not simple, they can contain cycles.
EDIT2: This also assumes that if a node appears more than once in a path, it will be counted more than once. This is apparently not what OP wanted!
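A runnable Python sketch of the pseudocode above, under the same two assumptions (walks may repeat nodes, and repeated nodes are counted each time). It further assumes nodes are numbered 0..n-1, edge weights are positive integers, edges are (u, v, w) triples with no self-loops, and every undirected edge is usable in both directions. The function name best_walk_values is made up.

```python
NEG_INF = float("-inf")


def best_walk_values(n, value, edges, T):
    """d[i][j][t] = best value sum of a walk from i to j of length exactly t."""
    d = [[[NEG_INF] * (T + 1) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        d[i][i][0] = value[i]
    for u, v, w in edges:
        if w <= T:  # edges longer than the threshold can never be used
            s = value[u] + value[v]
            d[u][v][w] = max(d[u][v][w], s)
            d[v][u][w] = max(d[v][u][w], s)
    for t in range(1, T + 1):
        for k in range(n):
            for t1 in range(1, t):
                for i in range(n):
                    left = d[i][k][t1]
                    if left == NEG_INF:
                        continue
                    for j in range(n):
                        right = d[k][j][t - t1]
                        if right != NEG_INF:
                            # k is counted in both halves, so subtract it once.
                            d[i][j][t] = max(d[i][j][t],
                                             left + right - value[k])
    return d
```

The best value for a given pair (i, j) within the threshold is then max(d[i][j]), which is -infinity if j cannot be reached from i within length T.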
Integer program (this may be a good idea or maybe not):
For each vertex v, let x_v be 1 if vertex v is visited and 0 otherwise. For each arc a, let y_a be the number of times arc a is used. Let s be the source and t be the destination. The objective is
maximize ∑_v value(v) x_v.
The constraints are
∑_a value(a) y_a ≤ threshold
∀v, ∑_{a has head v} y_a - ∑_{a has tail v} y_a = -1 if v = s; 1 if v = t; 0 otherwise (conserve flow)
∀v ≠ s, x_v ≤ ∑_{a has head v} y_a (must enter a vertex to visit)
∀v, x_v ≤ 1 (visit each vertex at most once)
∀v ∉ {s, t}, ∀cuts S that separate vertex v from {s, t}, x_v ≤ ∑_{a such that tail(a) ∉ S ∧ head(a) ∈ S} y_a (benefit only from vertices not on isolated loops).
To solve, do branch and bound with the relaxation values. Unfortunately, the last group of constraints is exponential in number, so when you're solving the relaxed dual, you'll need to generate columns. Typically for connectivity problems, this means using a min-cut algorithm repeatedly to find a cut worth enforcing. Good luck!
If you just add the weight of a node to the weights of its outgoing edges, you can forget about the node weights. Then you can use any of the standard algorithms for the shortest path problem.

Resources