How can I generate a random DFA with uniform distribution? - algorithm

I need to generate a Deterministic Finite Automaton (DFA), selected from all possible DFAs that satisfy the properties below. The DFA must be selected with uniform distribution.
The DFA must have the following four properties:
The DFA has N nodes.
Each node has 2 outgoing transitions.
Each node is reachable from every other node.
The DFA is chosen with perfectly uniform randomness from all possibilities.
I am not considering labeling of nodes or transitions. If two DFAs have the same unlabeled directed graph they are considered the same.
Here are three algorithms that don't work:
Algorithm #1
1. Start with a set of N nodes called A.
2. Choose a node from A and put it in set B.
3. While there are nodes left in set A:
- 3.1 Choose a node x from set A.
- 3.2 Choose a node y from set B with fewer than two outgoing transitions.
- 3.3 Choose a node z from set B.
- 3.4 Add a transition from y to x.
- 3.5 Add a transition from x to z.
- 3.6 Move x to set B.
4. For each node n in B:
- 4.1 While n has fewer than two outgoing transitions:
- 4.2 Choose a node m in B.
- 4.3 Add a transition from n to m.
Algorithm #2
Start with a directed graph with N vertices and no arcs.
Generate a random permutation of the N vertices to produce a random Hamiltonian cycle, and add it to the graph.
For each vertex add one outgoing arc to a randomly chosen vertex.
Algorithm #3
Start with a directed graph with N vertices and no arcs.
Generate a random directed cycle of some length between N and 2N and add it to the graph.
For each vertex add one outgoing arc to a randomly chosen vertex.
I created algorithm #3 based off of algorithm #2, however, I don't know how to select the random directed cycle to create a uniform distribution. I don't even know if it's possible.
Any help would be greatly appreciated.

If N is small (there are N^(2N) possible sets of arcs that meet the first two conditions, so you would want this to be less than the range of your random number generator) you can generate random DFAs and discard the ones that don't satisfy the reachability condition.
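A minimal sketch of this generate-and-discard approach, assuming nodes are numbered 0..N-1 and each node stores an ordered pair of targets. The function names are hypothetical, not from the original answer. Note the assumption baked in: this is uniform over labeled transition tables, so unlabeled graphs with many symmetries are weighted differently from those with few.

```python
import random
from collections import deque

def reaches_all(start, adjacency, n):
    """Return True if a BFS from `start` over `adjacency` visits all n nodes."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n

def is_strongly_connected(delta):
    """delta[u] is the pair of targets of node u (hypothetical representation)."""
    n = len(delta)
    succ = [list(targets) for targets in delta]
    pred = [[] for _ in range(n)]
    for u, targets in enumerate(delta):
        for v in targets:
            pred[v].append(u)
    # Node 0 must reach every node, and every node must reach node 0.
    return reaches_all(0, succ, n) and reaches_all(0, pred, n)

def random_strongly_connected_dfa(n, rng=random):
    """Draw labeled transition tables uniformly; discard those that fail the check."""
    while True:
        delta = [(rng.randrange(n), rng.randrange(n)) for _ in range(n)]
        if is_strongly_connected(delta):
            return delta
```

Calling `random_strongly_connected_dfa(5)` returns a list of five target pairs. The number of discarded draws grows with N, which is why the answer restricts this to small N.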

Related

Graph Theory : Will it stop or not?

I cannot find out how to start with this question:
A graph has n vertices and m edges. No pair of vertices is connected by more than one edge. Rahul starts to play a game:
He changes the edges in the following manner -
He selects one vertex, and adds an edge from that vertex to every other vertex to which no edge exists,
and at the same time, deletes all pre-existing edges from that vertex.
This game will stop only when there exists a direct edge between every two vertices. You need to determine whether it is possible to finish this game, or whether this will never happen, no matter what moves he makes.
Input : Initial state of graph will be given.
Output :"yes" or "no"
Can someone give a hint on how to start??
1) Order of moves doesn't matter (since you can exchange any two subsequent moves to the same result);
2) Two subsequent changes with the same vertex have zero effect;
3) You can get to final state iff. you can get there changing any vertex no more than once;
4) Any two connected vertices must be both changed or both unchanged; of any two vertices that are not connected, exactly one should be changed.
Take a connected component in the graph. Vertices there should all change or all remain unchanged. If the component isn't fully connected, finishing the game is impossible. If there are at least three connected components, finishing the game is impossible. If there are exactly two fully connected components, all vertices in exactly one of them should be changed.
Answer: the game can be finished if and only if the graph either is already fully connected or consists of two fully connected components. (It's easy to see that when the graph consists of two fully connected components, changing a vertex effectively moves it from one component to another.)
Algorithm to check for the answer: suppose we are given a number of vertices n and then a list of edges in form (a,b) where a and b are vertex numbers from [1,n]. Let vertices be an array of records (num_edges, connected, sample), initialized with (0,k,k) (k is vertex number). Then for each edge (A,B):
increase num_edges for A and B by 1;
if A.sample equals B.sample, go to next edge;
exchange A.connected and B.connected; then, starting from the A.connected vertex, set sample to A.sample and follow connected until reaching B; go to next edge.
Finally check that all vertices starting from 1 and following connected have the same num_edges equal to (their number - 1) and all remaining vertices for similar loop. Time should be O(max(n log(n), m)), memory O(n).
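The same condition (at most two components, each of them complete) can also be checked with a plain union-find instead of the linked `connected` records. This is a sketch under the problem's assumption that the input is a simple graph with vertices numbered 1..n; `can_finish` is a hypothetical name.

```python
def can_finish(n, edges):
    """Return True if the graph is complete or consists of two complete components.

    Assumes a simple graph (no self-loops, no repeated edges), vertices 1..n.
    """
    parent = list(range(n + 1))
    degree = [0] * (n + 1)

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
        parent[find(a)] = find(b)

    # Size of each connected component, keyed by its representative.
    size = {}
    for v in range(1, n + 1):
        root = find(v)
        size[root] = size.get(root, 0) + 1

    if len(size) > 2:
        return False
    # In a simple graph, a component is complete exactly when every vertex
    # in it has degree (component size - 1).
    return all(degree[v] == size[find(v)] - 1 for v in range(1, n + 1))
```

For example, `can_finish(4, [(1, 2), (3, 4)])` accepts the two-K2 graph, while the path `(1,2), (2,3), (3,4)` is rejected because its single component is not complete.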
A solved graph with n vertices will be a complete graph Kn with ½n(n-1) edges.
Flipping the state of a vertex will mean that the vertex becomes disconnected from the graph and there are two disconnected complete sub-graphs K1 and K(n-1), which contain 0 and ½(n-1)(n-2) edges, respectively.
Flipping the state of other vertices will disconnect each of them from the complete sub-graph containing it and connect it to all the vertices of the other complete sub-graph. So, generally, if there are x vertices that have flipped state then there will be two complete sub-graphs Kx and K(n-x) with ½x(x-1) and ½(n-x)(n-1-x) edges respectively, for a total of m = ½n(n-1) - nx + x² edges.
If we know m and n then we can solve the quadratic equation to find the number of moves x required to solve the problem:
x = ( n - SQRT( 4m + 2n - n² ) ) / 2
If x is a non-integer then the puzzle is not solvable.
If x is an integer then the puzzle may be solvable in exactly x moves, but an additional check is needed to see whether there are two disconnected complete sub-graphs Kx and K(n-x).
Algorithm:
Calculate x; if it is not an integer then the graph is not solvable. [Complexity: O(1)]
Pick a random vertex:
its degree should be either (x-1) or (n-x-1); if it is not then the graph is not solvable.
generate a list of all its adjacent vertices. [Complexity: O(n)]
perform a depth-first (or breadth-first) search from that vertex. [Complexity: O(n+m)]
if the number of edges visited is ½x(x-1) or ½(n-x)(n-1-x) (corresponding to the degree of the original vertex) and no vertices are visited that were not adjacent to the original then the sub-graph is complete and you know the graph is solvable; if either condition is not true then the graph is not solvable.
To be certain you could do the same for the other sub-graph but it is unnecessary.
Examples:
The graph where n=4,m=2 with edges (1,2) and (3,4) is solvable since x = ( 4 - SQRT( 0 ) ) / 2 = 2, an integer, and there are two K2 disconnected sub-graphs.
The graph where n=4,m=3 with edges (1,2), (2,3), (3,4) has x = ( 4 - SQRT( 4 ) ) / 2 = 1, an integer, but there is only one, connected non-complete graph when there should be two disconnected K1 and K3 sub-graphs.
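The first step of this algorithm can be sketched with exact integer arithmetic, which avoids floating-point square roots. `moves_needed` is a hypothetical helper name; it only evaluates the formula for x, so an integer result is a necessary condition and the structural check described above is still required.

```python
import math

def moves_needed(n, m):
    """Return x = (n - sqrt(4m + 2n - n^2)) / 2 if it is an integer, else None."""
    disc = 4 * m + 2 * n - n * n
    if disc < 0:
        return None  # no real solution
    root = math.isqrt(disc)
    if root * root != disc or (n - root) % 2 != 0:
        return None  # x would not be an integer
    return (n - root) // 2
```

With the two examples above, `moves_needed(4, 2)` gives 2 and `moves_needed(4, 3)` gives 1, even though only the first graph is actually solvable.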

Algorithm - Balancing a disconnected bipartite graph

I have a bipartite graph G. I would like to find a fast algorithm (asymptotically) to find an assignment of all the vertices of G into two sets X and Y such that the complete bipartite graph formed by the sets X and Y has as many edges as possible.
Slightly longer explanation:
G is bipartite and consists of a set of connected components (each of which is bipartite, obviously). We want to decide on a positioning (for lack of a better word) of each component into X and Y. After deciding upon all the positionings, we complete the bipartite graph (i.e. we connect every vertex of X to every vertex of Y). We then count how many edges there are in total (including original edges) and we want to maximize this count. Simple math shows that the number of edges would be |X|*|Y|.
My thought process for a solution:
As the number of components increases, the number of choices for G increases exponentially. However, if we take the number of connected components of G to be equal to the number of nodes in G, then the solution is simple - split so that the number of nodes in X and Y are equal (or almost equal in case of an odd number of nodes in G). This makes me want to generalize that the problem is the same thing as trying to minimize the difference in cardinalities of X and Y (which can be solved as in this SO question). However, I have been unsuccessful in proving so.
Let's decompose the problem.
Your graph is actually a set of connected components, each connected
component is (U_i,V_i,E_i).
To maximize the number of edges, you need to maximize the value of
|X|*|Y|
To get the maximal value of |X|*|Y|, you obviously need to use all
vertices (otherwise, by adding another vertex, you get a bigger value).
Your freedom of choice is actually to choose for each component i - if you should add U_i to X, and V_i to Y - or vice versa.
So, what you are actually trying to do is:
maximize:
sum { x_i * |U_i| + y_i * |V_i| : for each component i} * sum { x_i * |V_i| + y_i * |U_i| : for each component i}
subject to constraints:
x_i, y_i in {0,1} for all i
x_i + y_i = 1 for all i
The value we want to maximize behaves similarly to the function f(x) = x(k-x), because if we increase |X|, it comes at the expense of decreasing |Y|, and by the same amount. This function has a single maximum:
f(x) = xk - x^2
f'(x) = k - 2x = 0 ---> x = k/2
Meaning, we should distribute the nodes such that the cardinalities (sizes) of X and Y are as close as possible to each other (and use all the vertices).
This can be reduced to Partition Problem:
Given U_1,V_1,U_2,V_2,...,U_k,V_k
Create an instance of partition problem:
abs(|U_1| - |V_1|), abs(|U_2| - |V_2|), ... , abs(|U_k| - |V_k|)
Now, the optimal solution found to partition problem can be translated directly to which of U_i,V_i to include in which set, and will make sure the difference in sizes is kept to minimum.
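As an illustration, the balancing step can be sketched as a subset-sum style dynamic program over the achievable sizes of X. This is pseudo-polynomial in the number of vertices, not a way around the hardness of the partition problem. The function name and the input format (a list of `(|U_i|, |V_i|)` pairs) are assumptions made for the example.

```python
def best_split(components):
    """components: list of (|U_i|, |V_i|) pairs, one per connected component.

    Returns (|X|, |Y|, flips) where flips[i] is True when V_i goes to X
    and U_i goes to Y, and False for the opposite orientation.
    """
    # reachable maps each achievable |X| to one list of flip flags producing it.
    reachable = {0: []}
    for u, v in components:
        nxt = {}
        for size_x, flips in reachable.items():
            nxt.setdefault(size_x + u, flips + [False])  # U_i -> X, V_i -> Y
            nxt.setdefault(size_x + v, flips + [True])   # V_i -> X, U_i -> Y
        reachable = nxt
    total = sum(u + v for u, v in components)
    # |X| * |Y| is largest when |X| is as close as possible to total / 2.
    size_x = max(reachable, key=lambda s: s * (total - s))
    return size_x, total - size_x, reachable[size_x]
```

For components of sizes `(3, 1)`, `(2, 1)` and `(1, 1)` the achievable values of |X| are 3, 4, 5 and 6, so the best product is 4 * 5 = 20.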

NP-Complete reduction

The problem states that we want to show that Independent Set poly-time reduces to Relative Prime Sets, more formally Independent Set <p Relative Prime Sets.
I need to provide a reduction f from ind.set to rel. prime sets, where
- input of f must be a Graph G and an integer k, where k denotes the size of an independent set.
- output of f must be a set S of integers and an integer t, where t denotes the number of pairwise relative prime numbers in the set S.
Definition of relative prime sets (decision version):
it takes a set P of n-integers and an integer t from 1 to n.
returns yes if there's a subset A of P, with t-many pairwise relative
primes. That is, for all a, b in A, it must be true that gcd(a, b) =
1.
returns no otherwise
So far I have come-up with what I believe is a reduction, but I am not sure if it is valid and I want to double check it with someone who knows how to do this.
Reduction:
Let G be a graph. Let k indicate the size of an independent set. Then we
want to find-out if there exists an independent set of size k in G.
Since this problem is NP-Complete, if we can solve another NP-Complete
problem in poly-time, we know that we can also solve Independent Set
in poly-time. So we chose to reduce independent set to Relative Prime
Sets.
We take the graph G and label its vertices from 1 to n as per the
definition of the input for relative prime sets. Then we find the gcd
of each node to every other node in G. We draw an edge between the
nodes that have gcd(a, b) = 1. When the graph is complete, we look at
the nodes and determine which nodes are not connected to each other
via an edge. We create sets for those nodes. We return the set
containing the most nodes along with an integer t denoting the number
of integers in the set. This is the set of the most relative prime
numbers in the graph G and also the greatest independent set of G.
Suppose two graphs, each of four nodes. On graph one, the nodes are connected in a line so that the max-independent set is 2. Graph two is a complete graph each node is connected to each other node, so the max-independent set is 1.
It sounds like your reduction would result in the same set for each graph, leading to an incorrect result for independent set.
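The counterexample can be made concrete with a small sketch. `proposed_reduction` is a hypothetical rendering of the labeling step described in the question, and `max_independent_set_size` is a brute-force helper added only for the comparison. The point to notice is that the proposed construction depends on the number of vertices only and never looks at the edges of G.

```python
from itertools import combinations
from math import gcd

def proposed_reduction(n):
    """Label vertices 1..n and link labels whose gcd is 1, as in the question.

    The edges of G are never consulted, so every graph on n vertices
    yields the same output.
    """
    return {(a, b) for a, b in combinations(range(1, n + 1), 2) if gcd(a, b) == 1}

def max_independent_set_size(n, edges):
    """Brute force over all vertex subsets, largest first (small n only)."""
    edge_set = {frozenset(e) for e in edges}
    for k in range(n, 0, -1):
        for subset in combinations(range(1, n + 1), k):
            if all(frozenset(p) not in edge_set for p in combinations(subset, 2)):
                return k
    return 0

path = [(1, 2), (2, 3), (3, 4)]
complete = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```

Here `max_independent_set_size(4, path)` is 2 and `max_independent_set_size(4, complete)` is 1, yet `proposed_reduction(4)` is the single value produced for both graphs.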
Equation S = k*ln W: the discrete logarithm can't be broken because it is correlated with informational entropy.

Partitioning graph into 2 sets of vertices

I want to partition a connected graph into 2 sets of vertices, such that the difference of sum of edge-weights among vertices of each set is minimized.
For example, if a graph consists of vertices 1,2,3,4,5, consider this partition:
Set A - {1,2,3}
Set B - {4,5}
Sum A = {w(1 2) + w(2 3) + w(1 3)}
Sum B = {w(4 5)}
Diff = abs(Sum A - Sum B) ... (This is one possible partition difference.)
So, how do I find a partition such that the difference is minimized?
This problem is NP hard because it is at least as hard as the partition problem.
Sketch of proof
Consider a partition problem where we have the numbers {1,2,3,4,5} that we wish to partition into two sets with as small a difference as possible.
Construct the graph shown below:
If someone comes up with an algorithm to solve your problem, you can use that algorithm to partition this graph into two sets such that the difference between the sums of weights within each set is minimized.
In the optimal solution the blue and green nodes must be placed into different sets (because we have an edge with weight infinity connecting them). The remaining nodes will be connected to either the blue or green nodes. Call the ones connected to blue set1, and the ones connected to green set2. This partition will give the optimal answer to the partition problem.
Greedy algorithm
However, depending on the structure of your graph and values of the weights you may well be able to do a reasonable job.
For example, you could try:
Choose a random permutation of vertices
Loop through each vertex and assign to set 1 or 2 according to whichever minimises the objective function (which is just evaluated over the vertices assigned so far)
Repeat this algorithm a few times and keep track of the best score.
When you get down to just a few vertices left to be assigned, you could also try a brute force evaluation of all possible partitions of the remaining vertices to search for a good solution.
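A rough sketch of the randomized greedy steps, assuming the graph is given as a nested dictionary with `adj[u][v]` holding the weight of the undirected edge between u and v. That representation, the function name, and the default number of tries are assumptions for illustration, and the brute-force finish is not included.

```python
import random

def greedy_partition(adj, tries=20, rng=random):
    """Return (difference, set_1, set_2) for the best of `tries` greedy runs."""
    best = None
    vertices = list(adj)
    for _ in range(tries):
        rng.shuffle(vertices)  # random permutation of the vertices
        sets = (set(), set())
        sums = [0, 0]          # weight of edges lying inside each set so far
        for v in vertices:
            # Weight that v would add to a side: its edges to vertices already there.
            gain = [sum(w for u, w in adj[v].items() if u in s) for s in sets]
            diff_if_first = abs(sums[0] + gain[0] - sums[1])
            diff_if_second = abs(sums[0] - (sums[1] + gain[1]))
            side = 0 if diff_if_first <= diff_if_second else 1
            sets[side].add(v)
            sums[side] += gain[side]
        diff = abs(sums[0] - sums[1])
        if best is None or diff < best[0]:
            best = (diff, set(sets[0]), set(sets[1]))
    return best
```

The objective is evaluated incrementally, so each run costs time proportional to the number of edges plus vertices.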
The following algorithmic sketch is based on Iterated Local Search. The idea is to greedily optimize the current solution until a local optimal solution is found. Then disturb this solution to overcome the local optimal solution. Always keep track of the best solution found so far.
1. Randomly divide the set of vertices into V1 and V2.
2. Iterate:
- 2.1 Calculate the costs (edge-weight-difference) of your current division.
- 2.2 Select two random vertices v1 from V1 and v2 from V2.
- 2.3 Check whether swapping these vertices (move v1 to V2 and v2 to V1) would lead to lower costs (edge-weight-difference). If so, swap vertices v1 and v2, else keep the sets.
3. Disturb a converged solution by swapping half of the vertices in V1 with half of the vertices in V2. Go to 2.
Iterated Local Search is a surprisingly effective and practical heuristic -- even for NP-complete problems.
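The sketch below is one possible reading of these steps, not a tuned implementation. It assumes the graph is a nested dictionary `adj[u][v] = weight` with every edge stored in both directions. The fixed `swaps` budget stands in for "until converged", and all names and defaults are hypothetical.

```python
import random

def inner_weight(adj, s):
    """Sum of weights of edges with both endpoints in s (each edge counted once)."""
    return sum(w for u in s for v, w in adj[u].items() if v in s) / 2

def cost(adj, v1, v2):
    return abs(inner_weight(adj, v1) - inner_weight(adj, v2))

def iterated_local_search(adj, rounds=20, swaps=200, rng=random):
    """Return (best_cost, V1, V2) found over `rounds` perturbation rounds."""
    vertices = list(adj)
    rng.shuffle(vertices)
    half = len(vertices) // 2
    v1, v2 = set(vertices[:half]), set(vertices[half:])
    best = (cost(adj, v1, v2), set(v1), set(v2))
    for _ in range(rounds):
        current = cost(adj, v1, v2)
        # Local search: try random swaps and keep only the improving ones.
        for _ in range(swaps if v1 and v2 else 0):
            a, b = rng.choice(list(v1)), rng.choice(list(v2))
            n1, n2 = (v1 - {a}) | {b}, (v2 - {b}) | {a}
            candidate = cost(adj, n1, n2)
            if candidate < current:
                v1, v2, current = n1, n2, candidate
        if current < best[0]:
            best = (current, set(v1), set(v2))
        # Perturbation: swap half of the smaller side with as many from the other.
        k = min(len(v1), len(v2)) // 2
        out1, out2 = rng.sample(list(v1), k), rng.sample(list(v2), k)
        v1, v2 = (v1 - set(out1)) | set(out2), (v2 - set(out2)) | set(out1)
    return best
```

Recomputing the full cost for every candidate swap keeps the sketch short; an incremental update would be the natural optimization for larger graphs.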

What is the algorithm for generating a random Deterministic Finite Automaton?

The DFA must have the following four properties:
The DFA has N nodes
Each node has 2 outgoing transitions.
Each node is reachable from every other node.
The DFA is chosen with perfectly uniform randomness from all possibilities
This is what I have so far:
1. Start with a collection of N nodes.
2. Choose a node that has not already been chosen.
3. Connect its output to 2 other randomly selected nodes.
4. Label one transition 1 and the other transition 0.
5. Go to 2, unless all nodes have been chosen.
6. Determine if there is a node with no incoming connections.
7. If so, steal an incoming connection from a node with more than 1 incoming connection.
8. Go to 6, unless there are no nodes with no incoming connections.
However, this algorithm is not correct. Consider the graph where node 1 has its two connections going to node 2 (and vice versa), while node 3 has its two connections going to node 4 (and vice versa). That is something like:
1 <==> 2
3 <==> 4
Where, by <==> I mean two outgoing connections both ways (so a total of 4 connections). This seems to form 2 cliques, which means that not every state is reachable from every other state.
Does anyone know how to complete the algorithm? Or, does anyone know another algorithm? I seem to vaguely recall that a binary tree can be used to construct this, but I am not sure about that.
Strong connectivity is a difficult constraint. Let's generate uniform random surjective transition functions and then test them with e.g. Tarjan's linear-time SCC algorithm until we get one that's strongly connected. This process has the right distribution, but it's not clear that it's efficient; my researcher's intuition is that the limiting probability of strong connectivity is less than 1 but greater than 0, which would imply only O(1) iterations are necessary in expectation.
Generating surjective transition functions is itself nontrivial. Unfortunately, without that constraint it is exponentially unlikely that every state has an incoming transition. Use the algorithm described in the answers to this question to sample a uniform random partition of {(1, a), (1, b), (2, a), (2, b), …, (N, a), (N, b)} with N parts. Permute the nodes randomly and assign them to parts.
For example, let N = 3 and suppose that the random partition is
{{(1, a), (2, a), (3, b)}, {(2, b)}, {(1, b), (3, a)}}.
We choose a random permutation 2, 3, 1 and derive a transition function
(1, a) |-> 2
(1, b) |-> 1
(2, a) |-> 2
(2, b) |-> 3
(3, a) |-> 1
(3, b) |-> 2
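The last step, turning a partition plus a node permutation into a transition function, can be sketched as follows. Sampling the uniform random partition itself is not shown, and the function name and list-of-lists input format are assumptions for illustration.

```python
def transitions_from_partition(parts, node_order):
    """Map every (state, symbol) pair in parts[i] to node_order[i].

    parts: list of N non-empty lists of (state, symbol) pairs, covering each pair once.
    node_order: a permutation of the N states.
    """
    delta = {}
    for part, target in zip(parts, node_order):
        for state, symbol in part:
            delta[(state, symbol)] = target
    return delta

# The N = 3 example from the text, with the permutation 2, 3, 1.
parts = [[(1, "a"), (2, "a"), (3, "b")], [(2, "b")], [(1, "b"), (3, "a")]]
delta = transitions_from_partition(parts, [2, 3, 1])
```

Because every part is non-empty and every node receives exactly one part, each node ends up with at least one incoming transition, which is the surjectivity the answer is after.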
In what follows I'll use the basic terminology of graph theory.
You could:
Start with a directed graph with N vertices and no arcs.
Generate a random permutation of the N vertices to produce a random Hamiltonian cycle, and add it to the graph.
For each vertex add one outgoing arc to a randomly chosen vertex.
The result will satisfy all three requirements.
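A short sketch of this construction, assuming vertices are numbered 0..N-1 (the function name is hypothetical). The cycle guarantees strong connectivity, but the construction can only ever produce graphs that contain a Hamiltonian cycle, so it should not be expected to give a uniform distribution over all strongly connected graphs.

```python
import random

def hamiltonian_plus_random(n, rng=random):
    """Return arcs[v] = [cycle successor of v, one extra random target]."""
    order = list(range(n))
    rng.shuffle(order)  # random permutation defines the Hamiltonian cycle
    arcs = {v: [] for v in range(n)}
    for i, v in enumerate(order):
        arcs[v].append(order[(i + 1) % n])  # arc along the cycle
    for v in range(n):
        arcs[v].append(rng.randrange(n))    # second outgoing arc, chosen at random
    return arcs
```

Following the first arc of each vertex walks the whole cycle, which is an easy way to confirm that every vertex is reachable from every other.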
There is an algorithm with expected running time O(n^{3/2}).
If you generate a uniform random digraph with m vertices such that each vertex has k labelled out-arcs (a k-out digraph), then with high probability the largest SCC (strongly connected component) in this digraph is of size around c_k m, where c_k is a constant depending on k. Actually, there is about 1/\sqrt{m} probability that the size of this SCC is exactly c_k m (rounded to an integer).
So you can generate a uniform random 2-out digraph of size n/c_k, and check the size of the largest SCC. If its size is not exactly n, just try again until success. The expected number of trials needed is \sqrt{n}. And generating each digraph should be done in O(n) time. So in total the algorithm has expected running time O(n^{3/2}). See this paper for more details.
Just keep growing a set of nodes which are all reachable. Once they're all reachable, fill in the blanks.
Start with a set of N nodes called A.
Choose a node from A and put it in set B.
While there are nodes left in set A
Choose a node x from set A
Choose a node y from set B with less than two outgoing transitions.
Choose a node z from set B
Add a transition from y to x.
Add a transition from x to z
Move x to set B
For each node n in B
While n has less than two outgoing transitions
Choose a node m in B
Add a transition from n to m
Choose a node to be the start node.
Choose some number of nodes to be accepting nodes.
Every node in set B can reach every node in set B. As long as a node can be reached from a node in set B and that node can reach a node in set B, it can be added to the set.
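The growing-set procedure can be sketched like this, assuming nodes are numbered 0..N-1. The function name is hypothetical, and the choice of start and accepting nodes is left out. A node y with a free outgoing slot always exists, because B holds exactly two free slots at every step: adding x uses one slot of y and contributes one free slot of its own.

```python
import random

def grow_dfa(n, rng=random):
    """Return out[v] = list of the two transition targets of node v."""
    out = {v: [] for v in range(n)}
    a = list(range(n))
    rng.shuffle(a)
    b = [a.pop()]  # B starts with a single node
    while a:
        x = a.pop()
        y = rng.choice([v for v in b if len(out[v]) < 2])  # node in B with a free slot
        z = rng.choice(b)
        out[y].append(x)  # B can now reach x
        out[x].append(z)  # x can now reach B
        b.append(x)
    # Fill in the remaining blanks with arbitrary targets.
    for v in b:
        while len(out[v]) < 2:
            out[v].append(rng.choice(b))
    return out
```

The result is always strongly connected with two outgoing transitions per node, although nothing in the construction suggests that every such graph is equally likely.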
The simplest way that I can think of is to (uniformly) generate a random DFA with N nodes and two outgoing edges per node, ignoring the other constraints, and then throw away any that are not strongly connected (which is easy to test using a strongly connected components algorithm). Generating uniform DFAs should be straightforward without the reachability constraint. The one thing that could be problematic performance-wise is how many DFAs you would need to skip before you found one with the reachability property. You should try this algorithm first, though, and see how long it ends up taking to generate an acceptable DFA.
We can start with a random number of states N1 between N and 2N.
Assume the initial state is state number 1.
For each state, for each character in the input alphabet we generate a random transition (between 1 and N1).
We take the part of the automaton that is reachable from the initial state. We check its number of states, and after a few tries we get one with N states.
If we also want a minimal automaton, only the assignment of final states remains; however, there is a good chance that a random assignment yields a minimal automaton as well.
The following references seem to be relevant to your question:
F. Bassino, J. David and C. Nicaud, Enumeration and random generation of possibly incomplete deterministic automata, Pure Mathematics and Applications 19 (2-3) (2009) 1-16.
F. Bassino and C. Nicaud, Enumeration and Random Generation of Accessible Automata, Theor. Comp. Sc. 381 (2007) 86-104.
