Efficient algorithm for loop finding in graphs

I have to study the resistance of the principal cluster of a percolating network of conducting wires. Individual wires are labeled from 1 to n. I represent the network by a graph G(V,E) and find its adjacency matrix A, where A_ij = 1 if wires i and j are in contact, and 0 otherwise.
My question is the following: given that I need to implement Kirchhoff's laws on the main percolated cluster, I need an algorithm that returns, ideally, all the smallest loops in the cluster. Do you know of an algorithm (mine is brute force right now and not efficient) that finds all the loops inside a graph from its adjacency matrix?

In general, there can be exponentially many simple cycles (loops), so since you want only the "smallest", it sounds as though you don't want them all. If you're looking to write equations corresponding to Kirchhoff's second law for all possible cycles, then it suffices to use just the equation for each cycle in a cycle basis. There is a polynomial-time algorithm to find the cycle basis that uses the least total number of edges (a minimum cycle basis). Rather than implement that algorithm, however, it may suffice to switch from arc variables x_{u→v} to differences of node variables y_v - y_u (fix one node variable per connected component to be zero).
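If you do want the loops explicitly, here is a minimal sketch in Python (assuming numpy and networkx are available; the adjacency matrix below is made up):

    import networkx as nx
    import numpy as np

    # Made-up 4-wire contact matrix: a square of wires plus one diagonal.
    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 1],
                  [0, 1, 0, 1],
                  [1, 1, 1, 0]])

    G = nx.from_numpy_array(A)
    # Restrict to the main (largest) percolated cluster.
    main = G.subgraph(max(nx.connected_components(G), key=len))

    # One independent loop per Kirchhoff voltage equation; minimum_cycle_basis
    # picks basis cycles with the fewest edges overall.
    for cycle in nx.minimum_cycle_basis(main):
        print(cycle)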

Related

Finding size of largest connected component of a graph

Suppose we have a random undirected graph G = (V,E) with n vertices, where for any two vertices u and v ∈ V, the probability that the edge (u,v) is in E is 1/n. We need to figure out the size C(n) of the largest connected component of the graph.
C(n) should be Θ(n^a), and we need to run some experiments to estimate a.
I am a bit confused about how to link the probability 1/n to the largest connected component; is there any way I can do so?
The process you're simulating here is called the Erdős–Rényi model. You have a collection of n nodes, and each pair of nodes has probability p of being linked. The (expected) shape of the resulting graph depends heavily on the choice of p, and there are a lot of famous results about this.
As for how to do this: one option would be to create a collection of n nodes, iterate over all pairs of nodes, and link them with probability 1/n. You can then run an algorithm like BFS or DFS over the graph to find and size the connected components.
Another option would be the above approach, except that instead of doing a BFS or DFS, you use a disjoint-set forest to perform the links and then find the largest connected component.
Alternatively, because each possible edge is present with the same probability, independently of every other edge, the number of edges you have is binomially distributed and pretty tightly packed around its mean of roughly n/2 edges. You could therefore generate about n/2 random edges, add them into the graph, then use the above techniques. (This will be much faster, as it does O(n) work rather than O(n^2) work to process the edges.)
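A minimal sketch of that edge-sampling approach with a disjoint-set forest (a hypothetical helper written for this answer; sampling n/2 random pairs is a close stand-in for the exact binomial edge count):

    import random
    from collections import Counter

    def largest_component_size(n, rng=random):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x

        for _ in range(n // 2):                 # expected edge count is ~n/2
            u, v = rng.randrange(n), rng.randrange(n)
            parent[find(u)] = find(v)           # union; u == v is a harmless no-op

        return max(Counter(find(x) for x in range(n)).values())

    print(largest_component_size(100_000))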
Once you've gotten this worked out, you can vary n over a large range and run a regression on the log-log data to find the best-fit exponent. That's something you could either code up yourself, or do by importing your data into Excel and using its regression tools.
As a spoiler, when you're done you'll find that the number of nodes in the largest connected component is Θ(n^(2/3)). If you search for "Erdős–Rényi critical case," you can find online proofs of this result. It's not a trivial result to prove (and definitely isn't obvious!), but it'll drop out of your empirical analysis.
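As a hypothetical follow-up to the sketch above, the exponent can be estimated with a least-squares fit of log C(n) against log n (averaging several runs per n gives a cleaner estimate):

    import math

    ns = [2**k for k in range(10, 18)]
    sizes = [largest_component_size(n) for n in ns]

    xs = [math.log(n) for n in ns]
    ys = [math.log(s) for s in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    print(f"estimated a = {a:.2f}")             # should drift toward 2/3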

Which algorithm should match this specific Graph

Specific question here. Suppose you have a graph where each vertex specifies how many connections it must have to other vertices, and the following rules/properties apply:
1- The graph can be incomplete (there is no need for every vertex to have a connection with every other).
2- There can be two connections between two vertices only if they are in opposite directions (e.g., A points to B, B points to A).
3- Suppose the vertices are on a 2D plane; there can be no crossing of connections (not even tangents).
4- There's no interest in the shortest path, just in respecting the properties and knowing whether the solution is unique or not.
5- There can be no possible solution.
EDIT: Alright guys, sorry for not being specific. I'll try to clarify my point here: what I want to do is, given a number of vertices, know if a graph is connected (if all the points have at least one connection to the graph). The given vertices may make it impossible to build a graph, so I want to know if there is a solution, whether the solution is unique, or (worst case scenario) whether there is no possible solution. I think that clarifies points 4 and 5. The graph is undirected; the connections cannot curve, only straight lines. The nodes (vertices) are fixed; we have their positions from whatever input. I wanted to know the best approach, and I've been researching: it is a connectivity problem, though maybe some specific algorithm may be more efficient at this task. That's all, sorry for the late reply.
EDIT2: Alright guys, would the problem be different if we think of each vertex as sitting on a row and column of a plane matrix, able to connect only with other vertices on the same column or row? So there would be just 90/180/270/360-degree straight connections. This would hugely shorten the possibilities, right?
I am going to assume that the question is: Given the degree of each vertex, work out a graph that passes all the constraints given.
I think you can reduce this to a very large integer programming problem - linear constraints, but with the variables required to be integers (in fact either 0 or 1), which makes the problem much more difficult than ordinary linear programming.
Let the unknowns be of the form X_ij, where X_ij is 1 if there is an edge from node i to node j, and 0 otherwise. The requirements on the number of connections then amount to requirements of the form SUM_{all i} X_ij = K for some K dependent on the requirement. The requirement that the graph be planar reduces to the requirement that it not contain either of two known graphs (K5 and K3,3) as a minor - https://en.wikipedia.org/wiki/Graph_minor. Each forbidden configuration then produces a constraint such as X01 + X02 + ... < 5 - there will be a huge number of these constraints - so many that for a large number of nodes, simply producing all the constraints may be too expensive to be practical, let alone solving them. The number of constraints goes up as at least the 6th power of the number of nodes. However, this is polynomial, so it is theoretically practical to write down the MIP to be solved - perhaps better than no algorithm at all.
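As a concrete starting point, here is a minimal sketch of just the degree constraints, using the PuLP library in Python (the planarity constraints described above are omitted, and the degree sequence is made up):

    from pulp import LpProblem, LpVariable, lpSum

    n = 4
    degree = [2, 2, 2, 2]      # hypothetical required degree of each vertex

    prob = LpProblem("degree_constrained_graph")
    # One binary variable per unordered pair {i, j}: X_ij = 1 iff the edge exists.
    x = {(i, j): LpVariable(f"x_{i}_{j}", cat="Binary")
         for i in range(n) for j in range(i + 1, n)}

    for i in range(n):
        incident = [x[min(i, j), max(i, j)] for j in range(n) if j != i]
        prob += lpSum(incident) == degree[i]    # SUM_{all j} X_ij = degree[i]

    prob.solve()                # pure feasibility problem: no objective needed
    print([e for e, var in x.items() if var.value() == 1])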
Assuming that you are asking us to:
Find out if it is possible to generate one or more directed planar graphs such that each vertex has a given out-degree (not necessarily the same out-degree for every vertex).
Let's also assume that you want the graph to be connected.
If there are n vertices and the vertices have degrees d_1 ... d_n then for vertex i there are C(n-1,d_i) = (n-1)!/((d_i)!*(n-1-d_i)!) possible combinations of out-edges from that vertex. Taking the product of all these combinations over all the vertices will give you the upper bound on the number of possible graphs.
The naive approach (sketched after this list) is:
Generate all possible graphs.
Filter the graphs, keeping only the connected ones.
Run a planarity test on the graph to determine if it is planar (you can consider the graph to be undirected in this step); discard if it isn't.
Profit!
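A brute-force sketch of those steps in Python (assuming networkx for the connectivity and planarity tests; the out-degrees are made up, and this only scales to tiny n):

    from itertools import combinations, product
    import networkx as nx

    out_degree = [1, 1, 1, 1]          # hypothetical required out-degrees
    n = len(out_degree)

    # Step 1: every possible choice of out-edges per vertex.
    per_vertex = [list(combinations([j for j in range(n) if j != i], out_degree[i]))
                  for i in range(n)]

    solutions = set()
    for pick in product(*per_vertex):
        D = nx.DiGraph()
        D.add_nodes_from(range(n))
        for i, targets in enumerate(pick):
            D.add_edges_from((i, j) for j in targets)
        U = D.to_undirected()
        # Steps 2 and 3: keep only connected, planar candidates.
        if nx.is_connected(U) and nx.check_planarity(U)[0]:
            solutions.add(tuple(sorted(D.edges())))

    print(len(solutions), "distinct solution(s)")   # 0, 1, or many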

A network flow with different constraints

Consider a simple network flow model: G = (V,E), source node S, and sink node T. For each edge E[i], its capacity is C[i].
Then the flow F[i] on edge E[i] is constrained to be either C[i] or 0, that is, F[i] belongs to {0, C[i]}.
How to compute the maximum flow from S to T? Is this still a network flow problem?
The decision variant of your modified flow problem is NP-complete, as evidenced by the fact that the subset sum problem can be reduced to it: for given items w_1, ..., w_n and a target sum W, just create a source S connected to every item i via an edge S -> i of capacity w_i. Then connect every item i to an intermediate node t via another edge i -> t of capacity w_i. Add an edge t -> T of capacity W. There exists a subset of items with cumulative weight W iff the S-T max-flow in the graph is W with your modifications.
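To make the construction concrete, here is the reduction in Python for a hypothetical subset sum instance:

    # Subset sum instance (made-up numbers): items w, target W.
    w = [3, 5, 7]
    W = 10

    capacity = {}
    for i, wi in enumerate(w):
        capacity[("S", f"item{i}")] = wi    # S -> i with capacity w_i
        capacity[(f"item{i}", "t")] = wi    # i -> t with capacity w_i
    capacity[("t", "T")] = W                # t -> T with capacity W

    # A subset summing to W exists (here {3, 7}) iff the all-or-nothing
    # S-T max-flow equals W: each item edge carries w_i or nothing.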
So there is likely no algorithm that solves this problem efficiently in every case, but for instances not specifically designed to be hard, you can try an integer linear program formulation of the problem and use a general ILP solver to find a solution.
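A minimal PuLP sketch of that formulation (the edge data is made up): one binary variable per edge decides whether the edge carries its full capacity or nothing.

    from pulp import LpProblem, LpVariable, LpMaximize, lpSum

    cap = {("S", "a"): 3, ("S", "b"): 2, ("a", "T"): 2,
           ("b", "T"): 2, ("a", "b"): 1}            # hypothetical capacities
    inner = {"a", "b"}                              # nodes other than S and T

    prob = LpProblem("all_or_nothing_flow", LpMaximize)
    y = {e: LpVariable(f"y_{e[0]}_{e[1]}", cat="Binary") for e in cap}

    # Objective: total flow leaving the source.
    prob += lpSum(c * y[e] for e, c in cap.items() if e[0] == "S")

    # Conservation: all-or-nothing inflow equals all-or-nothing outflow.
    for v in inner:
        prob += (lpSum(c * y[e] for e, c in cap.items() if e[1] == v)
                 == lpSum(c * y[e] for e, c in cap.items() if e[0] == v))

    prob.solve()
    print([e for e in cap if y[e].value() == 1])    # here: S->b, b->T, value 2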
There might be a pseudopolynomial algorithm if your capacities are integers bounded by a value polynomial in the input size.
Um, no, it's no longer a well-defined flow problem, for the reason that Heuster gives: given two edges connected through a node (with no other connections), the flow must be zero unless the two capacities equal each other. Most generic flow algorithms will fail, as they cannot sequentially increase the flow.
Given the extreme restrictiveness of this condition on a general graph, I would fall back on a game tree, working backwards from the sink. Most nodes of the game tree will terminate quickly, as there will be no combination of flows into a node that exactly matches the needed outflows. With a reasonable heuristic you can probably find a reasonable search order and terminate the tree without having to search every branch.
In fact, you can probably exclude lots of nodes and remove lots of edges before you start, on the grounds that flows through certain nodes will be trivially impossible.

Computing the number of nodes that can be reached from each node in a directed graph

In a directed graph (suppose it has lots of cycles) I need to compute, for each node, the number of nodes that can be reached from it. How can I do that with minimal effort? Which algorithm do I need to use?
Note: I think a reasonable algorithm for this problem should compute these numbers recursively (e.g., the result for node a depends on that of node b if a is connected to b).
The algorithm you're looking for is called the Floyd-Warshall algorithm, a very nice and efficient dynamic programming algorithm. It can be used to calculate the set of nodes reachable from each individual node in a graph (the transitive closure), although it's more often used to calculate the shortest paths from each individual node in a graph to all other nodes.
(Edit: the Floyd-Warshall algorithm is more complicated than it needs to be for your uses, because it's been extended a bit by Floyd to calculate shortest paths. You may find this page helpful, which only describes the "Warshall" part of the algorithm - the part you need.)
I happen to be studying it right now for class and have the paper on my desk. The recurrence for the transitive closure version of F-W is:
T(i,j,k) = T(i,j,k-1) ∨ (T(i,k,k-1) ∧ T(k,j,k-1))
where T(a,b,c) is true if and only if there is a path from a to b using only the first c vertices as intermediate vertices (you must give the vertices an arbitrary numbering before running the algorithm).
Intuitively, the recurrence says that there's a path from i to j using the first k vertices as intermediates if:
there's already a path from i to j using only the first k-1 vertices, OR
there's a path from i to k, and a path from k to j, each using only the first k-1 vertices.
You can build up the entire 3-dimensional table of T(i,j,k) in the typical dynamic programming fashion, and then count all of the TRUE entries in the row of your source node (at the maximum k) to get the size of the transitive closure for that source node.
If you're still following my poor explanation, you can make the algorithm extremely efficient with a few tricks:
It turns out that you don't need the k dimension in your table; you can just overwrite your same row of values over and over. Now the program would look like:
T(i,j) = T(i,j) || (T(i,k) && T(k,j))
If T(i,k) is 0 then you can skip the whole thing since nothing will change on that step.
If T(i,k) is 1 then the new value will just be T(i,j) || T(k,j). This can be done in huge chunks because block OR is extremely fast on modern processors.
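A minimal sketch of that trick in Python, whose big integers act as arbitrarily wide bit rows (the example graph is made up):

    def reachable_counts(adj):
        # adj[i] is an int whose bit j is set iff there is an edge i -> j.
        # Returns, for each node, how many nodes it can reach (a node counts
        # as reaching itself only if it lies on a cycle).
        n = len(adj)
        rows = list(adj)
        for k in range(n):
            for i in range(n):
                if rows[i] >> k & 1:        # skip the step when T(i,k) is 0
                    rows[i] |= rows[k]      # row-wide OR: T(i,j) |= T(k,j)
        return [bin(r).count("1") for r in rows]

    # Tiny example: 0 -> 1 -> 2 -> 0, plus 2 -> 3.
    print(reachable_counts([0b0010, 0b0100, 0b1001, 0b0000]))   # [4, 4, 4, 0]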
Hope that helps...

Is this minimum spanning tree algorithm correct?

The minimum spanning tree problem is to take a connected weighted graph and find the subset of its edges with the lowest total weight while keeping the graph connected (and as a consequence resulting in an acyclic graph).
The algorithm I am considering is:
Find all cycles.
Remove the largest edge from each cycle.
The impetus for this version is an environment that is restricted to "rule satisfaction" without any iterative constructs. It might also be applicable to insanely parallel hardware (i.e., a system where you expect to have several times more degrees of parallelism than cycles).
Edits:
The above is done in a stateless manner (all edges that are not the largest edge in any cycle are selected/kept; all others are removed).
What happens if two cycles overlap? Which one has its longest edge removed first? Does it matter if the longest edge of each is shared between the two cycles or not?
For example:
V = { a, b, c, d }
E = { (a,b,1), (b,c,2), (c,a,4), (b,d,9), (d,a,3) }
There's an a -> b -> c -> a cycle, and an a -> b -> d -> a cycle.
#shrughes.blogspot.com:
I don't know about removing all but two - I've been sketching out various runs of the algorithm, and assuming that parallel runs may remove an edge more than once, I can't find a situation where I'm left without a spanning tree. Whether or not it's minimal I don't know.
For this to work, you'd have to detail how you would want to find all cycles, apparently without any iterative constructs, because that is a non-trivial task. I'm not sure that's possible. If you really want to find a MST algorithm that doesn't use iterative constructs, take a look at Prim's or Kruskal's algorithm and see if you could modify those to suit your needs.
Also, is recursion barred in this theoretical architecture? If so, it might actually be impossible to find an MST on a graph, because you'd have no means whatsoever of inspecting every vertex/edge on the graph.
I dunno if it works, but no matter what, your algorithm is not even worth implementing. Finding all cycles will be the freaking huge bottleneck that kills it. Also, doing that without iterations is impossible. Why don't you implement some standard algorithm, let's say Prim's?
Your algorithm isn't quite clearly defined. If you have a complete graph, your algorithm would seem to entail, in the first step, removing all but the two minimum elements. Also, listing all the cycles in a graph can take exponential time.
Elaboration:
In a graph with n nodes and an edge between every pair of nodes, there are, if I have my math right, n!/(2k(n-k)!) cycles of size k, if you're counting a cycle as some subgraph of k nodes and k edges with each node having degree 2.
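For small n you can sanity-check that count by brute force (a throwaway script; cycles are enumerated as vertex sequences, deduplicating rotations and the two traversal directions):

    from itertools import permutations
    from math import factorial

    def formula(n, k):
        return factorial(n) // (2 * k * factorial(n - k))

    def brute(n, k):
        seen = set()
        for perm in permutations(range(n), k):
            i = perm.index(min(perm))
            rot = perm[i:] + perm[:i]                   # canonical rotation
            rev = (rot[0],) + tuple(reversed(rot[1:]))  # reversed direction
            seen.add(min(rot, rev))
        return len(seen)

    for k in range(3, 6):
        print(k, formula(5, k), brute(5, k))            # columns agree: 10, 15, 12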
#Tynan The system can be described (somewhat oversimplified) as a system of rules describing categorizations. "Things are in category A if they are in B but not in C", "Nodes connected to nodes in Z are also in Z", "Every category in M is connected to a node N and has 'child' categories, also in M, for every node connected to N". It's slightly more complicated than this. (I have shown that by creating unstable rules you can model a Turing machine, but that's beside the point.) It can't explicitly define iteration or recursion, but it can operate on recursive data with rules like the 2nd and 3rd ones.
#Marcin, assume that there are an unlimited number of processors. It is trivial to show that the program can be run in O(n^2) for n being the longest cycle. With better data structures, this can be reduced to O(n * O(set lookup)). I can envision hardware (quantum computers?) that can evaluate all cycles in constant time, giving an O(1) solution to the MST problem.
The Reverse-delete algorithm seems to provide a partial proof of correctness (that the proposed algorithm will not produce a non-minimal spanning tree); this is derived by arguing that my algorithm will remove every edge that the Reverse-delete algorithm will. However, I'm not sure how to show that my algorithm won't delete more than that algorithm does.
Hhmm....
OK, this is an attempt to finish the proof of correctness. By analogy to the Reverse-delete algorithm, we know that enough edges will be removed. What remains is to show that there will not be too many edges removed.
Removing too many edges can be described as removing all the edges between the sides of a binary partition of the graph nodes. However, only edges in a cycle are ever removed; therefore, for all edges between the partitions to be removed, there needs to be a return path to complete the cycle. If we only consider edges between the partitions, then the algorithm can at most remove the larger of each pair of edges; it can never remove the smallest bridging edge. Therefore, for any arbitrary binary partitioning, the algorithm can't sever all links between the sides.
What remains is to show that this extends to >2 way partitions.
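For what it's worth, the cycle rule can be tested directly on the example above. For distinct weights, "largest edge of some cycle" is equivalent to "its endpoints are already connected using only strictly lighter edges", so a sketch of the stateless rule (assuming networkx) looks like:

    import networkx as nx

    edges = [("a", "b", 1), ("b", "c", 2), ("c", "a", 4),
             ("b", "d", 9), ("d", "a", 3)]

    def is_largest_in_some_cycle(u, v, w):
        lighter = nx.Graph()
        lighter.add_nodes_from("abcd")
        lighter.add_weighted_edges_from(e for e in edges if e[2] < w)
        return nx.has_path(lighter, u, v)

    kept = [e for e in edges if not is_largest_in_some_cycle(*e)]
    print(kept)   # [('a', 'b', 1), ('b', 'c', 2), ('d', 'a', 3)] -- the MST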
