Dominoes matching algorithm

Given some inputs, each consisting of a left and a right symbol, output chains which link the inputs.
Imagine the inputs are dominoes that you cannot flip horizontally and that you need to chain together. Big circular chains (ignore whether you could physically build them with real dominoes) are preferred over small circular chains, which in turn are preferred over chains whose start and end do not match.
Outputs with more circular chains (regardless of chain length) are what we are looking for. For example, an output of 3 circular chains is better than 1 big chain and a leftover single domino.
Can someone point me in the right direction? What group of problems does this belong to, and are there existing algorithms for solving it?
Examples (outputs may be incorrect!):

in[0]=(A,B)
in[1]=(B,C)
in[2]=(C,A)
out[0]=(0,1,2)

in[0]=(A,B)
in[1]=(B,A)
in[2]=(C,D)
in[3]=(D,C)
out[0]=(0,1)
out[1]=(2,3)

in[0]=(A,B)
in[1]=(B,A)
in[2]=(C,D)
in[3]=(E,F)
out[0]=(0,1)
out[1]=(2)
out[2]=(3)

in[0]=(A,B)
in[1]=(B,A)
in[2]=(C,D)
in[3]=(D,E)
out[0]=(0,1)
out[1]=(2,3)

in[0]=(A,B)
in[1]=(B,C)
in[2]=(C,D)
out[0]=(0,1,2)

Dominoes which cannot be flipped horizontally == directed graphs.
Putting dominoes one after the other gives a "path"; if it is a closed path, it's a circuit.
A circuit that includes all the vertices of a graph is a Hamiltonian circuit.
Your problem in graph theory terms is: how to split (decompose) your graph into a minimum number of subgraphs that have Hamiltonian circuits. (a.k.a. Hamiltonian graphs)

The problem as it is now is not as clearly stated as it could be - how exactly are solutions rated? What is the most important criterion? Is it the length of the longest chain? Is there a penalty for creating chains of length one?
It is often helpful in such problems to visualize the structure as a graph - say, assign a vertex V[i] to each tile. Then for each i, j create a directed edge from V[i] to V[j] if you can place tile i to the left of tile j in a chain (so if V[i] corresponds to (X, A) then V[j] corresponds to (A, Y) for some X, Y, A).
In such a graph chains are paths, cycles are closed ("circular") chains, and the problem has been reduced to finding some cycle and/or path covering of a graph. This type of problem can in turn often be reduced to matching or *-flow problems (max-flow, max-cost-max-flow, min-cost-max-flow or what have you).
But before you can reduce further you have to establish the precise rules according to which one solution is determined to be "better" than another.
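To make the tile graph concrete, here is a minimal Python sketch (my own illustration, not part of the original answer; the tile format and function name are my choices):

    from collections import defaultdict

    def build_tile_graph(tiles):
        """Build the directed "can be placed before" graph over tile indices.

        tiles is a list of (left, right) pairs; there is an edge i -> j
        whenever tiles[i] can sit immediately to the left of tiles[j],
        i.e. tiles[i][1] == tiles[j][0].
        """
        by_left = defaultdict(list)             # left symbol -> tile indices
        for j, (left, _right) in enumerate(tiles):
            by_left[left].append(j)

        edges = defaultdict(list)               # i -> list of j
        for i, (_left, right) in enumerate(tiles):
            for j in by_left[right]:
                if i != j:                      # skip pairing a tile with itself
                    edges[i].append(j)
        return dict(edges)

    tiles = [("A", "B"), ("B", "C"), ("C", "A")]
    print(build_tile_graph(tiles))              # {0: [1], 1: [2], 2: [0]}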

It is easy to check whether there exists a circular chain consisting of all dominoes. First you need to make the following directed graph G:
Nodes of G are the symbols on the dominoes (A, B, C, ... in your example),
For each domino (A,B) you put a directed edge from A to B.
There exists a circular chain consisting of all dominoes iff there exists an Eulerian cycle in G. To check whether G has an Eulerian cycle it is sufficient to check that every node has equal in-degree and out-degree and that all nodes with at least one edge lie in a single connected component.
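As a small illustration (my own sketch, not from the original answer; the domino representation and function name are assumptions), here is that check in Python, with the connectivity part included explicitly:

    from collections import defaultdict

    def has_eulerian_cycle(dominoes):
        """Return True iff all dominoes can be arranged into one circular chain.

        dominoes is a list of (left, right) symbol pairs. We build the
        directed symbol graph and test the directed Euler-circuit conditions:
        every node has in-degree == out-degree, and all nodes that touch an
        edge lie in one connected component.
        """
        out_deg = defaultdict(int)
        in_deg = defaultdict(int)
        adj = defaultdict(set)                  # undirected view, for connectivity
        for a, b in dominoes:
            out_deg[a] += 1
            in_deg[b] += 1
            adj[a].add(b)
            adj[b].add(a)

        nodes = set(adj)
        if any(in_deg[v] != out_deg[v] for v in nodes):
            return False

        start = next(iter(nodes), None)
        if start is None:
            return True                         # no dominoes at all
        seen, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen == nodes

    print(has_eulerian_cycle([("A", "B"), ("B", "C"), ("C", "A")]))  # True
    print(has_eulerian_cycle([("A", "B"), ("C", "D")]))              # False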

I'm not sure if this is really the case, but judging by your examples, your problem looks similar to the problem of decomposing a permutation into a product of disjoint cycles. Each tile (X,Y) can be seen as P(X) = Y for a permutation P. If this agrees with your assumptions, the good (or bad) news is that such a decomposition is unique (up to the cycle order) and is very easy to find. Basically, you start with any tile, find the tile that matches its right symbol, and follow this until no match can be found. Then you move to the next untouched tile. If that's not what you are looking for, the more general graph-based solution by t.dubrownik looks like the way to go.
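A tiny sketch of that follow-the-tiles idea (mine, not the answerer's), under the assumption that every symbol appears at most once as a left symbol, so the tiles really do describe a permutation:

    def decompose_into_cycles(tiles):
        """Decompose tiles into chains by repeatedly following the match.

        tiles is a list of (left, right) pairs, assumed to describe a
        permutation P(left) = right; returns a list of chains, each chain
        being a list of tile indices.
        """
        next_by_left = {left: i for i, (left, _r) in enumerate(tiles)}
        used = [False] * len(tiles)
        chains = []
        for start in range(len(tiles)):
            if used[start]:
                continue
            chain, i = [], start
            while i is not None and not used[i]:
                used[i] = True
                chain.append(i)
                i = next_by_left.get(tiles[i][1])   # tile whose left symbol matches
            chains.append(chain)
        return chains

    print(decompose_into_cycles([("A", "B"), ("B", "A"), ("C", "D"), ("D", "C")]))
    # [[0, 1], [2, 3]]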

Related

Cycle detection that handles a series of directed edges? [duplicate]

I came upon wait-for graphs and I wonder, are there any efficient algorithms for detecting if adding an edge to a directed graph results in a cycle?
The graphs in question are mutable (they can have nodes and edges added or removed). And we're not interested in actually knowing an offending cycle, just knowing there is one is enough (to prevent adding an offending edge).
Of course it'd be possible to use an algorithm for computing strongly connected components (such as Tarjan's) to check if the new graph is acyclic or not, but running it again every time an edge is added seems quite inefficient.
If I understood your question correctly, then a new edge (u,v) is only inserted if there was no path from v to u before (i.e., if (u,v) does not create a cycle). Thus, your graph is always a DAG (directed acyclic graph). Using Tarjan's Algorithm to detect strongly connected components (http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm) sounds like overkill in this case. Before inserting (u,v), all you have to check is whether there is a directed path from v to u, which can be done with a simple BFS/DFS.
So the simplest way of doing it is the following (n = |V|, m = |E|):
Inserting (u,v): Check whether there is a path from v to u (BFS/DFS). Time complexity: O(m)
Deleting edges: Simply remove them from the graph. Time complexity: O(1)
Although inserting (u,v) takes O(m) time in the worst case, it is probably pretty fast in your situation. When doing the BFS/DFS starting from v to check whether u is reachable, you only visit vertices that are reachable from v. I would guess that in your setting the graph is pretty sparse and that the number of vertices reachable from any given vertex is not that high.
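A minimal sketch of this simple approach (the class and method names are my own): reject an insertion if a DFS from v can already reach u.

    class IncrementalDigraph:
        """Keeps a directed graph acyclic by rejecting cycle-creating edges.

        Insertion does a DFS from v to see whether u is already reachable
        (O(m) worst case); deletion is O(1), as described above.
        """

        def __init__(self):
            self.succ = {}                      # node -> set of successors

        def _reachable(self, src, dst):
            stack, seen = [src], {src}
            while stack:
                v = stack.pop()
                if v == dst:
                    return True
                for w in self.succ.get(v, ()):
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            return False

        def add_edge(self, u, v):
            """Insert u -> v; return False if it would create a cycle."""
            if u == v or self._reachable(v, u):
                return False
            self.succ.setdefault(u, set()).add(v)
            self.succ.setdefault(v, set())
            return True

        def remove_edge(self, u, v):
            self.succ.get(u, set()).discard(v)

    g = IncrementalDigraph()
    print(g.add_edge("a", "b"), g.add_edge("b", "c"), g.add_edge("c", "a"))
    # True True False -- the last edge would close a cycle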
However, if you want to improve the theoretical running time, here are some hints (mostly showing that this will not be very easy). Assume we aim for testing in O(1) time whether there exists a directed path from v to u. The keyword in this context is the transitive closure of a DAG (i.e., a graph that contains an edge (u, v) if and only if there is a directed path from u to v in the DAG). Unfortunately, maintaining the transitive closure in a dynamic setting seems to be not that simple. There are several papers considering this problem and all papers I found were STOC or FOCS papers, which indicates that they are very involved. The newest (and fastest) result I found is in the paper Dynamic Transitive Closure via Dynamic Matrix Inverse by Sankowski (http://dl.acm.org/citation.cfm?id=1033207).
Even if you are willing to understand one of those dynamic transitive closure algorithms (or even want to implement it), they will not give you any speed-up, for the following reason. These algorithms are designed for the situation where you have a lot of connectivity queries (which then can be performed in O(1) time) and only few changes in the graph. The goal then is to make these changes cheaper than recomputing the transitive closure. However, this update is still slower than a single check for connectivity. Thus, if you need to do an update on every connectivity query, it is better to use the simple approach mentioned above.
So why do I mention this approach of maintaining the transitive closure if it does not fit your needs? Well, it shows that searching for an algorithm with only O(1) query time probably does not lead you to a solution faster than the simple one using BFS/DFS. What you could try is to get a query time that is faster than O(m) but worse than O(1), while updates are also faster than O(m). This is a very interesting problem, but it sounds to me like a very ambitious goal (so maybe do not spend too much time trying to achieve it..).
As Mark suggested, it is possible to use a data structure that stores which nodes are connected. The simplest is a boolean |V|x|V| matrix. Its values can be initialized with the Floyd–Warshall algorithm, which takes O(|V|^3).
Let T(i) be the set of vertices that have a path to vertex i, and F(j) the set of vertices reachable from vertex j. With element (a,b) meaning "there is a path from a to b", the first set is the true entries in the i-th column and the second the true entries in the j-th row.
Adding an edge (i,j) is a simple operation. If i and j weren't connected before, then for each a in T(i) and each b in F(j), set matrix element (a,b) to true. But the operation isn't cheap; in the worst case it is O(|V|^2). That happens, for example, with a directed line, where adding an edge from the last vertex back to the first makes every vertex connected to every other vertex.
Removing an edge (i,j) is not so simple, but it is no more expensive in the worst case :-) If there is still a path from i to j after removing the edge, then nothing changes. That can be checked with Dijkstra (or a plain BFS/DFS) in less than O(|V|^2). The pairs (a,b) that are no longer connected are:
a in T(i) - i - T(j),
b in F(j) + j
Only T(j) changes when removing edge (i,j), so it has to be recalculated. That is done with any kind of graph traversal (BFS, DFS), going in the opposite edge direction from vertex j, in less than O(|V|^2). Since setting the matrix elements is again O(|V|^2) in the worst case, this operation has the same worst-case complexity as adding an edge.
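Here is a small Python sketch of the matrix approach (mine, not the answerer's): a Floyd-Warshall-style initialization plus the O(|V|^2) update for adding an edge. I treat reachability reflexively, so T(i) contains i and F(j) contains j, which keeps the update short; edge removal, which needs T(j) to be recomputed as described above, is omitted here.

    def reachability_matrix(n, edges):
        """reach[a][b] is True iff there is a (possibly empty) path from a to b.
        Initialized Floyd-Warshall style in O(|V|^3)."""
        reach = [[a == b for b in range(n)] for a in range(n)]
        for a, b in edges:
            reach[a][b] = True
        for k in range(n):
            for a in range(n):
                if reach[a][k]:
                    for b in range(n):
                        if reach[k][b]:
                            reach[a][b] = True
        return reach

    def add_edge(reach, i, j):
        """Update the matrix for a new edge (i, j) in O(|V|^2):
        every a that reaches i now also reaches every b reachable from j."""
        if reach[i][j]:
            return
        n = len(reach)
        sources = [a for a in range(n) if reach[a][i]]   # T(i), including i
        targets = [b for b in range(n) if reach[j][b]]   # F(j), including j
        for a in sources:
            for b in targets:
                reach[a][b] = True

    reach = reachability_matrix(4, [(0, 1), (1, 2)])
    add_edge(reach, 2, 3)
    print(reach[0][3])   # True: 0 -> 1 -> 2 -> 3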
This is a problem which I recently faced in a slightly different situation (optimal ordering of interdependent compiler instructions).
While I can't improve on O(n*n) theoretical bounds, after a fair amount of experimentation and assuming heuristics for my case (for example, assuming that the initial ordering wasn't created maliciously) the following was the best compromise algorithm in terms of performance.
(In my case I had an acceptable "right side failure": after the initial nodes and arcs were added (which was guaranteed to be possible), it was acceptable for the optimiser to occasionally reject the addition of further arcs where one could actually be added. This approximation isn't necessary for this algorithm when carried to completion, but it does admit such an approximation if you wish, further limiting its runtime.)
While a graph is topologically sorted, it is guaranteed to be cycle-free. In the first phase when I had a static bulk of nodes and arcs to add, I added the nodes and then topologically sorted them.
During the second phase, adding additional arcs, there are two situations when considering an arc from A to B. If A already lies to the left of B in the sort, an arc can simply be added and no cycle can be generated, as the list is still topologically sorted.
If B is to the left of A, we consider the sub-sequence between B and A and partition it into two disjoint sequences X, Y, where X is those nodes which can reach A (and Y the others). If A is not reachable from B, i.e. there are no direct arcs from B into X or to A, then the sequence can be reordered XABY before adding the A to B arc, showing it is still cycle-free and maintaining the topological sort. The efficiency over the naive algorithm here is that we only need consider the subsequence between B and A as our list is topologically sorted: A is not reachable from any node to the right of A. For my situation, where localised reorderings are the most frequent and important, this is an important gain.
As we don't reorder within the sequences X,A,B,Y, clearly any arcs which start or end within the same sequence are still ordered correctly, and the same in each flank, and any "fly-over" arcs from the left to the right flanks. Any arcs between the flanks and X,A,B,Y are also still ordered correctly as our reordering is restricted to this local region. So we only need to consider arcs between our four sequences. Consider each possible "problematic" arc for our final ordering XABY in turn: YB YA YX BA BX AX. Our initial order was B[XY]A, so AX and YB cannot occur. X reaches A, but Y does not, therefore YX and YA do not occur or A could be reached from the source of the arc in Y (potentially via X) a contradiction. Our criterion for acceptability was that there are no links BX or BA. So there are no problematic arcs, and we are still topologically sorted.
Conversely, if our acceptability criterion fails (A is reachable from B), then adding the arc A->B clearly creates a cycle B -(X)-> A -> B, so the criterion is exactly the condition for staying cycle-free.
This can be implemented reasonably efficiently if we can add a flag to each node. Consider the nodes [BXY] going right-to-left from the node immediately to the left of A. If that node has a direct arc to A then set its flag. At an arbitrary such node, we need only consider direct outgoing arcs: the nodes to its right are either after A (and so irrelevant), or else have already been flagged if they can reach A, so the flag on such an arbitrary node is set when any flagged node is encountered by direct link. If B is not flagged at the end of the process, the reordering is acceptable and the flagged nodes comprise X.
Though this always yields a correct ordering if carried to completion (as far as I can tell), as I mentioned in the introduction it is particularly efficient if your initial build is approximately correct (in the sense of accommodating likely additional arcs without reordering).
There also exists an effective approximation, if your context is such that "outrageous" arcs can be rejected (those which would massively reorder) by limiting the A to B distance you are prepared to scan. If you have an initial list of the additional arcs you wish to add, they can be ordered by increasing distance in the initial ordering until you run out of some scanning "credit", and call your optimisation a day at that point.
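Since the description above is fairly abstract, here is one possible reading of it as Python (the class, names and details are my own, a sketch rather than the author's implementation): keep the nodes in a list that is always a topological order, and on inserting an arc A->B with B to the left of A, flag the window nodes that can reach A, reject if B gets flagged, and otherwise reorder the window as X A B Y.

    class LocalReorderDAG:
        def __init__(self, nodes):
            self.order = list(nodes)               # always a topological order
            self.pos = {v: i for i, v in enumerate(self.order)}
            self.out = {v: set() for v in self.order}

        def add_arc(self, a, b):
            """Add arc a -> b; return False (changing nothing) if it would
            create a cycle."""
            if a == b:
                return False
            if self.pos[a] < self.pos[b]:          # already consistent
                self.out[a].add(b)
                return True
            lo, hi = self.pos[b], self.pos[a]
            window = self.order[lo:hi + 1]         # b ... a
            flagged = set()                        # window nodes that can reach a
            for v in reversed(window[:-1]):        # right-to-left, excluding a
                if a in self.out[v] or self.out[v] & flagged:
                    flagged.add(v)
            if b in flagged:                       # a reachable from b -> cycle
                return False
            x = [v for v in window[1:-1] if v in flagged]
            y = [v for v in window[1:-1] if v not in flagged]
            new_window = x + [a, b] + y            # the XABY reordering
            self.order[lo:hi + 1] = new_window
            for i, v in enumerate(new_window, start=lo):
                self.pos[v] = i
            self.out[a].add(b)
            return True

    g = LocalReorderDAG(["p", "q", "r", "s"])
    print(g.add_arc("p", "q"), g.add_arc("q", "r"), g.add_arc("r", "p"))
    # True True False -- the last arc would create a cycle
    print(g.add_arc("s", "p"), g.order)            # True ['s', 'p', 'q', 'r']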
If the graph is directed, you would only have to check the parent nodes (navigate up until you reach the root) of the node where the new edge should start. If one of the parent nodes is equal to the end of the edge, adding the edge would create a cycle.
If all previous jobs are in topologically sorted order, then if you add an edge that appears to break the sort and it cannot be fixed, you have a cycle.
https://stackoverflow.com/a/261621/831850
So if we have a sorted list of nodes:
1, 2, 3, ..., x, ..., z, ...
Such that each node is waiting for nodes to its left.
Say we want to add an edge from x->z. Well, that appears to break the sort. So we can move the node at x to position z+1, which will fix the sort iff none of the nodes in (x, z] have an edge to the node at x.
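A small sketch of that single-shift fix (names mine), using this answer's convention that an edge u -> v means "u waits for v", so v must end up to the left of u. Note that a failed check only means this particular shift does not work; it does not by itself prove a cycle.

    def try_add_edge(order, pos, edges, x, z):
        """Try to add the edge x -> z ("x waits for z") while keeping the
        invariant that every node only waits for nodes to its left.

        order is the current node list, pos maps node -> index, and
        edges[u] is the set of nodes u waits for. Returns True on success.
        """
        if pos[z] < pos[x]:                        # already consistent
            edges.setdefault(x, set()).add(z)
            return True
        between = order[pos[x] + 1:pos[z] + 1]     # the range (x, z]
        if any(x in edges.get(w, ()) for w in between):
            return False                           # some node in (x, z] waits for x
        order.remove(x)
        order.insert(pos[z], x)                    # lands just after z (z moved left)
        for i, v in enumerate(order):
            pos[v] = i
        edges.setdefault(x, set()).add(z)
        return True

    order = ["a", "b", "c"]
    pos = {v: i for i, v in enumerate(order)}
    edges = {}
    print(try_add_edge(order, pos, edges, "a", "c"), order)   # True ['b', 'c', 'a']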

Which algorithm should match this specific Graph

A specific question here. Suppose you have a graph where each vertex specifies how many connections it must have to other vertices, and the following rules/properties apply:
1- The graph can be incomplete (no need for every vertex to have a connection with every other)
2- There can be two connections between two vertices only if they are in opposite directions (e.g. A points to B, B points to A).
3- The vertices are on a 2D plane, and there can be no crossing of connections (not even tangents).
4- There's no interest in the shortest path, just in respecting the properties and knowing whether the solution is unique or not.
5- There may be no possible solution.
EDIT: Alright guys, sorry for not being specific. I'll try to clarify my point here: what I want to do is, given a number of vertices, know whether a graph is connected (whether all the points have at least one connection to the graph). The vertices given may make such a graph impossible, so I want to know if there is a solution, whether the solution is unique or not, or (worst case scenario) whether there is no possible solution. I think that clarifies points 4 and 5. The graph is undirected, and the connections cannot curve, only straight lines. The nodes (vertices) are fixed; we have their positions from whatever input. I wanted to know the best approach, and I've been researching; it is a connectivity problem, though maybe some specific algorithm may be more efficient at this task. That's all, sorry for the late reply.
EDIT2: Alright guys, would the problem be different if we think of each vertex as sitting on a row and column of a plane matrix, so that it can only connect with other vertices on the same column or row? So connections would only be straight lines at 90/180/270/360 degrees. This would hugely shorten the possibilities, right?
I am going to assume that the question is: Given the degree of each vertex, work out a graph that passes all the constraints given.
I think you can reduce this to a very large integer programming problem - linear constraints, but with the variables required to be integers (in fact either 0 or 1), which makes the problem much more difficult than ordinary linear programming.
Let the unknowns be of the form Xij, where Xij is 1 if there is an edge from node i to node j, and 0 otherwise. The requirements on the number of connections then amount to constraints of the form SUM_{all i} Xij = K for some K depending on the requirement. The requirement that the graph is planar reduces to the requirement that the graph not contain two known graphs (K5 and K3,3) as minors - https://en.wikipedia.org/wiki/Graph_minor. Each possible occurrence of such a minor then produces a constraint such as X01 + X02 + ... < 5 - there will be a huge number of these constraints - so many that for a large number of nodes simply producing all the constraints may be too expensive to be practical, let alone solving them. The number of constraints goes up as at least the 6th power of the number of nodes. However this is polynomial, so it is theoretically practical to write down the MIP to be solved - so perhaps this is better than no algorithm at all.
Assuming that you are asking us to:
Find out if it is possible to generate one-or-more directed planar graphs such that each vertex has a given out-degree (not necessarily the same out-degree per vertex).
Let's also assume that you want the graph to be connected.
If there are n vertices and the vertices have degrees d_1 ... d_n then for vertex i there are C(n-1,d_i) = (n-1)!/((d_i)!*(n-1-d_i)!) possible combinations of out-edges from that vertex. Taking the product of these counts over all the vertices gives an upper bound on the number of possible graphs.
The naive approach is:
Generate all possible graphs.
Filter the graphs to only have connected graphs.
Run a planarity test on the graph to determine if it is planar (you can consider the graph to be undirected in this step); discard if it isn't.
Profit!
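For very small inputs, the naive approach can be sketched directly. The version below is my own illustration, using networkx for the connectivity and planarity tests; like the answer, it treats planarity abstractly on the underlying undirected graph rather than checking the fixed straight-line positions from the question.

    from itertools import combinations, product
    import networkx as nx

    def feasible_graphs(degrees):
        """Enumerate digraphs where vertex i has out-degree degrees[i], and keep
        those whose underlying undirected graph is connected and planar.
        Pure brute force: only usable for a handful of vertices."""
        n = len(degrees)
        per_vertex_choices = [
            list(combinations([j for j in range(n) if j != i], degrees[i]))
            for i in range(n)
        ]
        solutions = []
        for choice in product(*per_vertex_choices):
            arcs = [(i, j) for i in range(n) for j in choice[i]]
            g = nx.Graph()
            g.add_nodes_from(range(n))
            g.add_edges_from(arcs)
            if not nx.is_connected(g):
                continue
            is_planar, _embedding = nx.check_planarity(g)
            if is_planar:
                solutions.append(arcs)
        return solutions

    sols = feasible_graphs([1, 1, 2])
    print(len(sols), "solution(s); one of them:", sols[0] if sols else None)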

Minimum spanning tree with two edges tied

I'd like to solve a harder version of the minimum spanning tree problem.
There are N vertices. Also there are 2M edges numbered by 1, 2, .., 2M. The graph is connected, undirected, and weighted. I'd like to choose some edges to make the graph still connected and make the total cost as small as possible. There is one restriction: an edge numbered by 2k and an edge numbered by 2k-1 are tied, so both should be chosen or both should not be chosen. So, if I want to choose edge 3, I must choose edge 4 too.
So, what is the minimum total cost to make the graph connected?
My thoughts:
Let's call the two edges 2k-1 and 2k an edge set.
Let's call an edge valid if it merges two different components.
Let's call an edge set good if both of the edges are valid.
First add exactly m good edge sets in increasing order of cost. Then iterate over all the edge sets in increasing order of cost, and add a set if at least one of its edges is valid. m should be iterated from 0 to M.
Run Kruskal's algorithm with a variation: the cost of an edge e varies.
If the edge set which contains e is good, the cost is (the cost of the edge set) / 2.
Otherwise, the cost is (the cost of the edge set).
I cannot prove whether Kruskal's algorithm is still correct with these modified costs.
Sorry for the poor English, but I'd like to solve this problem. Is it NP-hard or something, or is there a good solution? :D Thanks in advance!
As I speculated earlier, this problem is NP-hard. I'm not sure about inapproximability; there's a very simple 2-approximation (split each pair in half, retaining the whole cost for both halves, and run your favorite vanilla MST algorithm).
Given an algorithm for this problem, we can solve the NP-hard Hamilton cycle problem as follows.
Let G = (V, E) be the instance of Hamilton cycle. Clone all of the vertices, denoting the clone of vi by vi'. We duplicate each edge e = {vi, vj} (making a multigraph; we can do this reduction with simple graphs at the cost of clarity), and, letting v0 be an arbitrary original vertex, we pair one copy with {v0, vi'} and the other with {v0, vj'}.
No MST can use fewer than n pairs, one to connect each cloned vertex to v0. The interesting thing is that the other halves of the pairs of a candidate with n pairs like this can be interpreted as an oriented subgraph of G where each vertex has out-degree 1 (use the index in the cloned bit as the tail). This graph connects the original vertices if and only if it's a Hamilton cycle on them.
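For reference, the 2-approximation mentioned at the start of this answer can be sketched as follows (my own illustration, with 0-based pairs: edges 2k and 2k+1 form pair k). Each half-edge is charged the full cost of its pair, a plain Kruskal MST is built, and the whole pair of every selected edge is taken.

    class DSU:
        def __init__(self, n):
            self.parent = list(range(n))

        def find(self, x):
            while self.parent[x] != x:
                self.parent[x] = self.parent[self.parent[x]]
                x = self.parent[x]
            return x

        def union(self, a, b):
            ra, rb = self.find(a), self.find(b)
            if ra == rb:
                return False
            self.parent[ra] = rb
            return True

    def tied_mst_2_approx(n, edges, pair_cost):
        """2-approximation: Kruskal on half-edges charged the full pair cost.

        edges[e] = (u, v) for e in 0..2M-1; edges 2k and 2k+1 form pair k,
        and pair_cost[k] is the cost of choosing that pair.
        """
        order = sorted(range(len(edges)), key=lambda e: pair_cost[e // 2])
        dsu = DSU(n)
        chosen_pairs = set()
        for e in order:
            u, v = edges[e]
            if dsu.union(u, v):
                chosen_pairs.add(e // 2)
        return chosen_pairs, sum(pair_cost[k] for k in chosen_pairs)

    # pair 0 = edges 0,1 (cost 3); pair 1 = edges 2,3 (cost 5)
    edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
    print(tied_mst_2_approx(4, edges, pair_cost=[3, 5]))   # ({0, 1}, 8)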
There are various ways to apply integer programming. Here's a simple one and a more complicated one. First we formulate a binary variable x_i for each i that is 1 if edge pair 2i-1, 2i is chosen. The problem template looks like
minimize sum_i w_i x_i (drop the w_i if the problem is unweighted)
subject to
<connectivity>
for all i, x_i in {0, 1}.
Of course I have left out the interesting constraints :). One way to enforce connectivity is to solve this formulation with no connectivity constraints at first, then examine the solution. If it's connected, then great -- we're done. Otherwise, find a set of vertices S such that no chosen edges run between S and its complement, and add a constraint
sum_{i such that edge pair i connects S with its complement} x_i >= 1
and repeat.
Another way is to generate constraints like this inside of the solver working on the linear relaxation of the integer program. Usually MIP libraries have a feature that allows this. The fractional problem has fractional connectivity, however, which means finding min cuts to check feasibility. I would expect this approach to be faster, but I must apologize as I don't have the energy to describe it in detail.
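To make the first (lazy-cut) variant concrete, here is a sketch using the PuLP modelling library as one possible MIP front end (function and helper names are mine; pairs are 0-based, so edges 2k and 2k+1 form pair k). The loop adds a connectivity cut only when the current integer solution is disconnected.

    import pulp

    def solve_tied_mst(n, edges, pair_cost):
        """Lazy connectivity cuts. edges[e] = (u, v) for e = 0..2M-1;
        edges 2k and 2k+1 form pair k with cost pair_cost[k]."""
        m = len(pair_cost)
        prob = pulp.LpProblem("tied_mst", pulp.LpMinimize)
        x = [pulp.LpVariable(f"x{k}", cat="Binary") for k in range(m)]
        prob += pulp.lpSum(pair_cost[k] * x[k] for k in range(m))

        def component_of(start, chosen_pairs):
            adj = {v: set() for v in range(n)}
            for k in chosen_pairs:
                for e in (2 * k, 2 * k + 1):
                    u, v = edges[e]
                    adj[u].add(v)
                    adj[v].add(u)
            seen, stack = {start}, [start]
            while stack:
                v = stack.pop()
                for w in adj[v]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            return seen

        while True:
            prob.solve(pulp.PULP_CBC_CMD(msg=False))
            chosen = [k for k in range(m) if x[k].value() > 0.5]
            s = component_of(0, chosen)
            if len(s) == n:                        # connected: done
                return chosen, sum(pair_cost[k] for k in chosen)
            crossing = [k for k in range(m)        # pairs crossing the cut (S, V-S)
                        if any((edges[e][0] in s) != (edges[e][1] in s)
                               for e in (2 * k, 2 * k + 1))]
            prob += pulp.lpSum(x[k] for k in crossing) >= 1

    print(solve_tied_mst(3, [(0, 1), (1, 2), (0, 2), (0, 1)], [4, 7]))
    # ([0], 4): pair 0 alone already connects all three vertices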
I'm not sure if it's the best solution, but my first approach would be a search using backtracking:
Of all edge pairs, mark those that could be removed without disconnecting the graph.
Remove one of these sets and find the optimal solution for the remaining graph.
Put the pair back and remove the next one instead, find the best solution for that.
This works, but is slow and inelegant. It might be possible to rescue this approach though with a few adjustments that avoid unnecessary branches.
Firstly, the edge pairs that could still be removed form a set that only shrinks as you go deeper. So, in the next recursion, you only need to check those in the previous set of possibly removable edge pairs. Also, since the order in which you remove the edge pairs doesn't matter, you shouldn't consider any edge pairs that were already considered before.
Then, checking if two nodes are connected is expensive. If you cache the alternative route for an edge, you can check relatively quickly whether that route still exists. If it doesn't, you have to run the expensive check, because even though that one route ceased to exist, there might still be others.
Then, some more pruning of the tree: Your set of removable edge pairs gives a lower bound to the weight that the optimal solution has. Further, any existing solution gives an upper bound to the optimal solution. If a set of removable edges doesn't even have a chance to find a better solution than the best one you had before, you can stop there and backtrack.
Lastly, be greedy. Using a regular greedy algorithm will not give you an optimal solution, but it will quickly raise the bar for any solution, making pruning more effective. Therefore, attempt to remove the edge pairs in the order of their weight loss.
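A heavily simplified sketch of this branch-and-bound idea (my own, and it leaves out the route caching and the greedy seeding described above): decide pair by pair whether to keep it, prune with the best cost found so far, and check connectivity at the leaves.

    def min_cost_connected(n, pairs):
        """pairs[k] = (cost_k, [(u, v), (u2, v2)]): the two tied edges.
        Returns the minimum total cost of a set of pairs whose edges connect
        all n vertices, or None if even the full graph is disconnected.
        Exponential-time branch and bound - a sketch, not a tuned solver."""

        def connected(kept):
            adj = {v: set() for v in range(n)}
            for k in kept:
                for u, v in pairs[k][1]:
                    adj[u].add(v)
                    adj[v].add(u)
            seen, stack = {0}, [0]
            while stack:
                v = stack.pop()
                for w in adj[v]:
                    if w not in seen:
                        seen.add(w)
                        stack.append(w)
            return len(seen) == n

        order = sorted(range(len(pairs)), key=lambda k: -pairs[k][0])
        best = [sum(c for c, _ in pairs) if connected(range(len(pairs))) else None]

        def search(i, kept, cost):
            if best[0] is not None and cost >= best[0]:
                return                             # cannot beat the incumbent
            if i == len(order):
                if connected(kept):
                    best[0] = cost
                return
            k = order[i]
            search(i + 1, kept, cost)              # try dropping the pair first
            search(i + 1, kept + [k], cost + pairs[k][0])

        search(0, [], 0)
        return best[0]

    pairs = [(3, [(0, 1), (1, 2)]), (5, [(0, 2), (2, 3)]), (9, [(1, 3), (0, 3)])]
    print(min_cost_connected(4, pairs))   # 8: pairs 0 and 1 connect all four vertices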

Seeking a graph algorithm: how to make the graph happy?

Background:
As you can see below, there is an undirected graph on the left of the figure. Vertices are represented by S1,S2 ... S6, and edges are represented by line segments between vertices. Every edge has a weight (the number near the edge), either positive or negative.
Definitions:
In the graph, a simple cycle is called a conflicting cycle if it has an odd number of negative edges, and a concordant cycle if it has an even (or zero) number of negative edges. On the left of the figure below, for example, the graph has two conflicting cycles (S1-S2-S3-S1 and S2-S3-S4-S2), and the other cycles are concordant. A graph is called happy if it has no conflicting cycle.
Objective:
Make the graph happy by removing some edges, while ensuring the cost (the sum of absolute values of weights of removed edges) is lowest. In the figure below, for example, after removing the edge (red line segment), there are no conflicting cycles. So the graph becomes happy, and the cost is only 2.
This problem is NP-hard by reduction from maximum cut. Given an instance of maximum cut, multiply all of the edge weights by -1. The constraints of this problem dictate that edges be removed so as to eliminate all odd cycles, i.e., we need to find the maximum-weight bipartite subgraph.
This problem in fact is equivalent to a 2-label unique label cover problem. The goal is to color each vertex black or white so as to minimize the sum of costs for (i) positive edges that connect vertices of different colors (ii) negative edges that connect vertices of the same color. Deleting all of these edges is a valid solution to the original problem. Conversely, given a valid set of edges to delete, we can determine a coloring. I expect that there's an approximation algorithm based on semidefinite programming (and the relaxation could be used for branch and bound).
Unless you're familiar with combinatorial optimization, however, the algorithm that I would suggest is integer programming. Let x(e) be 1 if we delete edge e and let x(e) be 0 if we don't.
minimize sum_{edges e} cost(e) x(e)
subject to
for every simple cycle C with an odd number of negative edges,
sum_{edges e in C} x(e) >= 1
for each edge e, x(e) in {0, 1}
The solver will do most of the work. The problem is handling the exponential number of constraints that I've written. The crudest thing to do is to generate all simple cycles and give the solver the whole program. Another possibility is to solve to optimality with a subset of the constraints, test whether the solution is actually valid, and, if not, introduce one or more missing constraints. To do the test, attempt to two-color the undeleted subgraph such that vertices joined by a positive edge have identical colors and vertices joined by a negative edge have different colors. Color greedily; if we get stuck, then there's an odd cycle at fault.
With more sophistication, it's possible to solve the program as written via a technique called column generation.
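The two-coloring feasibility test described above is easy to sketch as a BFS (my own illustration, not part of the answer): it either returns a valid coloring of the undeleted subgraph or a conflicting cycle that can be turned into a new constraint.

    from collections import deque

    def check_happy(n, edges):
        """edges is a list of (u, v, sign) with sign = +1 or -1.
        Try to 2-color the vertices so that +1 edges join equal colors and
        -1 edges join different colors. Returns ("ok", colors) on success,
        or ("conflict", cycle_edge_indices) with a conflicting cycle."""
        adj = {v: [] for v in range(n)}
        for idx, (u, v, sign) in enumerate(edges):
            adj[u].append((v, sign, idx))
            adj[v].append((u, sign, idx))

        def path_to_root(x, parent):
            out = []
            while parent[x] is not None:
                p, e = parent[x]
                out.append(e)
                x = p
            return out

        color, parent = {}, {}
        for root in range(n):
            if root in color:
                continue
            color[root], parent[root] = 0, None
            queue = deque([root])
            while queue:
                u = queue.popleft()
                for v, sign, idx in adj[u]:
                    want = color[u] if sign > 0 else 1 - color[u]
                    if v not in color:
                        color[v], parent[v] = want, (u, idx)
                        queue.append(v)
                    elif color[v] != want:
                        pu, pv = path_to_root(u, parent), path_to_root(v, parent)
                        while pu and pv and pu[-1] == pv[-1]:
                            pu.pop()
                            pv.pop()               # drop the shared part above the LCA
                        return "conflict", pu + pv[::-1] + [idx]
        return "ok", color

    edges = [(0, 1, 1), (1, 2, 1), (0, 2, -1)]
    print(check_happy(3, edges))   # a conflicting triangle with one negative edge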
I've written a solver for this problem (under the name "Signed Graph Balancing"). It is based on a fixed-parameter algorithm that is fast if only a few edges need to be deleted. The method is described in the paper "Separator-based data reduction for signed graph balancing".

How do I find the number of Hamiltonian cycles that do not use "forbidden" edges?

This question actually rephrases that one. The code jam problem is the following:
You are given a complete undirected graph with N nodes and K "forbidden" edges. N <= 300, K <= 15. Find the number of Hamiltonian cycles in the graph that do not use any of the K "forbidden" edges.
The straightforward DP approach of O(2^N*N^2) does not work for such N. It looks like the winning solutions are O(2^K). Does anybody know how to solve this problem?
Let's find out, for each subset S of forbidden edges, how many Hamiltonian cycles use all edges of S. If we solve this subtask then we'll solve the problem by the inclusion-exclusion formula.
Now how do we solve the subtask? Let's draw all edges of S. If there exists a vertex of degree more than 2, then obviously we cannot complete the cycle and the answer is 0. Otherwise the graph is divided into connected components. Each component is a sole vertex, a cycle or a simple path.
If there exists a cycle, then it must pass through all vertices, otherwise we won't be able to complete the Hamiltonian cycle. If this is the case, the answer is 2. (The cycle can be traversed in 2 directions.) Otherwise the answer is 0.
The remaining case is when there are c paths and k sole vertices. To complete the Hamiltonian cycle we must choose the direction of each path (2^c ways) and then choose the order of components. We've got c+k components, so they can be rearranged in (c+k)! ways. But we are interested in cycles, so we don't distinguish the orderings which turn into one another by cyclic shifts. (So (1,2,3), (2,3,1) and (3,1,2) are the same.) It means that we must divide the answer by the number of shifts, c+k. So the answer (to the subtask) is 2^c (c+k-1)!.
This idea is implemented in bmerry's solution (very clean code, btw).
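A compact sketch of the whole counting scheme (my own, not bmerry's code). It counts cycles with an orientation, up to rotation, matching the 2^c (c+k-1)! convention above; divide by 2 at the end if you want undirected cycles.

    from math import factorial
    from collections import defaultdict

    def count_hamiltonian_cycles(n, forbidden):
        """Count Hamiltonian cycles of K_n that avoid all forbidden edges,
        by inclusion-exclusion over subsets of the forbidden edges."""

        def cycles_using(subset):
            """Hamiltonian cycles that contain every edge of `subset`."""
            deg, adj = defaultdict(int), defaultdict(list)
            for u, v in subset:
                deg[u] += 1
                deg[v] += 1
                adj[u].append(v)
                adj[v].append(u)
            if any(d > 2 for d in deg.values()):
                return 0
            seen, paths = set(), 0
            for start in list(adj):                # walk each path from an endpoint
                if start in seen or deg[start] == 2:
                    continue
                seen.add(start)
                cur, prev = start, None
                while True:
                    nxt = [w for w in adj[cur] if w != prev]
                    if not nxt:
                        break
                    prev, cur = cur, nxt[0]
                    seen.add(cur)
                paths += 1
            if len(seen) != len(adj):              # some component is a cycle
                return 2 if (len(adj) == n and not seen) else 0
            c, k = paths, n - len(adj)             # path components, sole vertices
            return 2 ** c * factorial(c + k - 1)

        total = 0
        for mask in range(1 << len(forbidden)):
            subset = [forbidden[i] for i in range(len(forbidden)) if mask >> i & 1]
            sign = -1 if bin(mask).count("1") % 2 else 1
            total += sign * cycles_using(subset)
        return total

    # K_4 has (4-1)! = 6 oriented Hamiltonian cycles; forbidding edge (0, 1)
    # leaves only the two orientations of 0-2-1-3-0.
    print(count_hamiltonian_cycles(4, [(0, 1)]))   # 2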
The Hamiltonian cycle problem is a special case of the travelling salesman problem (obtained by setting the distance between two cities to a finite constant if they are adjacent and to infinity otherwise).
These are NP-complete problems, which in simple words means that no fast solution to them is known.
A trivial heuristic algorithm for locating Hamiltonian paths is to construct a path abc... and extend it until no longer possible; when the path abc...xyz cannot be extended any longer because all neighbours of z already lie in the path, one goes back one step, removing the edge yz and extending the path with a different neighbour of y; if no choice produces a Hamiltonian path, then one takes a further step back, removing the edge xy and extending the path with a different neighbour of x, and so on. This algorithm will certainly find a Hamiltonian path (if one exists), but it runs in exponential time.
For more, check the NP-completeness chapter of "Introduction to Algorithms" by Cormen et al.
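That backtracking heuristic is short enough to sketch (my own illustration; the adjacency format and names are mine):

    def hamiltonian_path(adj):
        """Backtracking search for a Hamiltonian path, as sketched above.
        adj maps each vertex to the set of its neighbours.
        Returns a list of vertices or None; exponential time in the worst case."""
        n = len(adj)

        def extend(path, visited):
            if len(path) == n:
                return path
            for w in adj[path[-1]]:
                if w not in visited:
                    visited.add(w)
                    path.append(w)
                    result = extend(path, visited)
                    if result is not None:
                        return result
                    path.pop()                  # dead end: drop the last edge, retry
                    visited.remove(w)
            return None

        for start in adj:                       # try every starting vertex
            result = extend([start], {start})
            if result is not None:
                return result
        return None

    adj = {0: {1, 2}, 1: {0, 3}, 2: {0, 3}, 3: {1, 2}}
    print(hamiltonian_path(adj))                # e.g. [0, 1, 3, 2]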

Resources