Related
I came upon wait-for graphs and I wonder, are there any efficient algorithms for detecting if adding an edge to a directed graph results in a cycle?
The graphs in question are mutable (they can have nodes and edges added or removed). And we're not interested in actually knowing an offending cycle, just knowing there is one is enough (to prevent adding an offending edge).
Of course it'd be possible to use an algorithm for computing strongly connected components (such as Tarjan's) to check if the new graph is acyclic or not, but running it again every time an edge is added seems quite inefficient.
If I understood your question correctly, then a new edge (u,v) is only inserted if there was no path from v to u before (i.e., if (u,v) does not create a cycle). Thus, your graph is always a DAG (directed acyclic graph). Using Tarjan's Algorithm to detect strongly connected components (http://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm) sounds like an overkill in this case. Before inserting (u,v), all you have to check is whether there is a directed path from v to u, which can be done with a simple BFS/DFS.
So the simplest way of doing it is the following (n = |V|, m = |E|):
Inserting (u,v): Check whether there is a path from v to u (BFS/DFS). Time complexity: O(m)
Deleting edges: Simply remove them from the graph. Time complexity: O(1)
Although inserting (u,v) takes O(m) time in the worst case, it is probably pretty fast in your situation. When doing the BFS/DFS starting from v to check whether u is reachable, you only visit vertices that are reachable from v. I would guess that in your setting the graph is pretty sparse and that the number of vertices reachable by another is not that high.
However, if you want to improve the theoretical running time, here are some hints (mostly showing that this will not be very easy). Assume we aim for testing in O(1) time whether there exists a directed path from v to u. The keyword in this context is the transitive closure of a DAG (i.e., a graph that contains an edge (u, v) if and only if there is a directed path from u to v in the DAG). Unfortunately, maintaining the transitive closure in a dynamic setting seems to be not that simple. There are several papers considering this problem and all papers I found were STOC or FOCS papers, which indicates that they are very involved. The newest (and fastest) result I found is in the paper Dynamic Transitive Closure via Dynamic Matrix Inverse by Sankowski (http://dl.acm.org/citation.cfm?id=1033207).
Even if you are willing to understand one of those dynamic transitive closure algorithms (or even want to implement it), they will not give you any speed up for the following reason. These algorithms are designed for the situation, where you have a lot of connectivity queries (which then can be performed in O(1) time) and only few changes in the graph. The goal then is to make these changes cheaper than recomputing the transitive closure. However, this update is still slower that a single check for connectivity. Thus, if you need to do an update on every connectivity query, it is better to use the simple approach mentioned above.
So why do I mention this approach of maintaining the transitive closure if it does not fit your needs? Well, it shows that searching an algorithm consuming only O(1) query time does probably not lead you to a solution faster than the simple one using BFS/DFS. What you could try is to get a query time that is faster than O(m) but worse than O(1), while updates are also faster than O(m). This is a very interesting problem, but it sounds to me like a very ambitious goal (so maybe do not spend too much time on trying to achieve it..).
As Mark suggested it is possible to use data structure that stores connected nodes. It is the best to use boolean matrix |V|x|V|. Values can be initialized with Floyd–Warshall algorithm. That is done in O(|V|^3).
Let T(i) be set of vertices that have path to vertex i, and F(j) set of vertices where exists path from vertex j. First are true's in i'th row and second true's in j'th column.
Adding an edge (i,j) is simple operation. If i and j wasn't connected before, than for each a from T(i) and each b from F(j) set matrix element (a,b) to true. But operation isn't cheap. In worst case it is O(|V|^2). That is in case of directed line, and adding edge from end to start vertex makes all vertices connected to all other vertices.
Removing an edge (i,j) is not so simple, but not more expensive operation in the worst case :-) If there is a path from i to j after removing edge, than nothing changes. That is checked with Dijkstra, less than O(|V|^2). Vertices that are not connected any more are (a,b):
a in T(i) - i - T(j),
b in F(j) + j
Only T(j) is changed with removing edge (i,j), so it has to be recalculated. That is done by any kind of graph traversing (BFS, DFS), by going in opposite edge direction from vertex j. That is done in less then O(|V|^2). Since setting of matrix element is in worst case is again O(|V|^2), this operation has same worst case complexity as adding edge.
This is a problem which I recently faced in a slightly different situation (optimal ordering of interdependent compiler instructions).
While I can't improve on O(n*n) theoretical bounds, after a fair amount of experimentation and assuming heuristics for my case (for example, assuming that the initial ordering wasn't created maliciously) the following was the best compromise algorithm in terms of performance.
(In my case I had an acceptable "right side failure": after the initial nodes and arcs were added (which was guaranteed to be possible), it was acceptable for the optimiser to occasionally reject the addition of further arcs where one could actually be added. This approximation isn't necessary for this algorithm when carried to completion, but it does admit such an approximation if you wish to do so, and so limiting its runtime further).
While a graph is topologically sorted, it is guaranteed to be cycle-free. In the first phase when I had a static bulk of nodes and arcs to add, I added the nodes and then topologically sorted them.
During the second phase, adding additional arcs, there are two situations when considering an arc from A to B. If A already lies to the left of B in the sort, an arc can simply be added and no cycle can be generated, as the list is still topologically sorted.
If B is to the left of A, we consider the sub-sequence between B and A and partition it into two disjoint sequences X, Y, where X is those nodes which can reach A (and Y the others). If A is not reachable from B, ie there are no direct arcs from B into X or to A, then the sequence can be reordered XABY before adding the A to B arc, showing it is still cycle-free and maintaining the topological sort. The efficiency over the naive algorithm here is that we only need consider the subsequence between B and A as our list is topologically sorted: A is not reachable from any node to the right of A. For my situation, where localised reorderings are the most frequent and important, this an important gain.
As we don't reorder within the sequences X,A,B,Y, clearly any arcs which start or end within the same sequence are still ordered correctly, and the same in each flank, and any "fly-over" arcs from the left to the right flanks. Any arcs between the flanks and X,A,B,Y are also still ordered correctly as our reordering is restricted to this local region. So we only need to consider arcs between our four sequences. Consider each possible "problematic" arc for our final ordering XABY in turn: YB YA YX BA BX AX. Our initial order was B[XY]A, so AX and YB cannot occur. X reaches A, but Y does not, therefore YX and YA do not occur or A could be reached from the source of the arc in Y (potentially via X) a contradiction. Our criterion for acceptability was that there are no links BX or BA. So there are no problematic arcs, and we are still topologically sorted.
Our only acceptability criterion (that A is not reachable from B) is clearly sufficient to create a cycle on adding the arc A->B: B -(X)-> A -> B, so the converse is also shown.
This can be implemented reasonably efficiently if we can add a flag to each node. Consider the nodes [BXY] going right-to-left from the node immediately to the left of A. If that node has a direct arc to A then set the flag. At an arbitrary such node, we need only consider direct outgoing arcs: the nodes to its right are either after A (and so irrelevant), or else have already been flagged if reachable from A, so the flag on such an arbitrary node is set when any flagged nodes are encountered by direct link. If B is not flagged at the end of the process, the reordering is acceptable and the flagged nodes comprise X.
Though this always yields a correct ordering if carried to completion (as far as I can tell), as I mentioned in the introduction it is particularly efficient if your initial build is approximately correct (in the sense of accommodating of likely additional arcs without reordering).
There also exists an effective approximation, if your context is such that "outrageous" arcs can be rejected (those which would massively reorder) by limiting the A to B distance you are prepared to scan. If you have an initial list of the additional arcs you wish to add, they can be ordered by increasing distance in the initial ordering until you run out of some scanning "credit", and call your optimisation a day at that point.
If the graph is directed, you would only have to check the parent nodes (navigate up until you reach the root) of the node where the new edge should start. If one of the parent nodes is equal to the end of the edge, adding the edge would create a cycle.
If all previous jobs are in Topologically sorted order. Then if you add an edge that appears to brake the sort, and can not be fixed, then you have a cycle.
https://stackoverflow.com/a/261621/831850
So if we have a sorted list of nodes:
1, 2, 3, ..., x, ..., z, ...
Such that each node is waiting for nodes to its left.
Say we want to add an edge from x->z. Well that appears to brake the sort. So we can move the node at x to position z+1 which will fix the sort iif none of the nodes (x, z] have an edge to the node at x.
I have an undirected graph. One edge in that graph is special. I want to find all other edges that are part of a even cycle containing the first edge.
I don't need to enumerate all the cycles, that would be inherently NP I think. I just need to know, for each each edge, whether it satisfies the conditions above.
A brute force search works of course but is too slow, and I'm struggling to come up with anything better. Any help appreciated.
I think we have an answer (I must credit my colleague with the idea). Essentially his idea is to do a flood fill algorithm through the space of even cycles. This works because if you have a large even cycle formed by merging two smaller cycles then the smaller cycles must have been both even or both odd. Similarly merging an odd and even cycle always forms a larger odd cycle.
This is a practical option only because I can imagine pathological cases consisting of alternating even and odd cycles. In this case we would never find two adjacent even cycles and so the algorithm would be slow. But I'm confident that such cases don't arise in real chemistry. At least in chemistry as it's currently known, 30 years ago we'd never heard of fullerenes.
If your graph has a small node degree, you might consider using a different graph representation:
Let three atoms u,v,w and two chemical bonds e=(u,v) and k=(v,w). A typical way of representing such data is to store u,v,w as nodes and e,k as edges in a graph.
However, one may represent e and k as nodes in the graph, having edges like f=(e,k) where f represents a 2-step link from u to w, f=(e,k) or f=(u,v,w). Running any algorithm to find cycles on such a graph will return all even cycles on the original graph.
Of course, this is efficient only if the original graph has a small node degree. When a user performs an edit, you can easily edit accordingly the alternative representation.
In a tower defense game, you have an NxM grid with a start, a finish, and a number of walls.
Enemies take the shortest path from start to finish without passing through any walls (they aren't usually constrained to the grid, but for simplicity's sake let's say they are. In either case, they can't move through diagonal "holes")
The problem (for this question at least) is to place up to K additional walls to maximize the path the enemies have to take. For example, for K=14
My intuition tells me this problem is NP-hard if (as I'm hoping to do) we generalize this to include waypoints that must be visited before moving to the finish, and possibly also without waypoints.
But, are there any decent heuristics out there for near-optimal solutions?
[Edit] I have posted a related question here.
I present a greedy approach and it's maybe close to the optimal (but I couldn't find approximation factor). Idea is simple, we should block the cells which are in critical places of the Maze. These places can help to measure the connectivity of maze. We can consider the vertex connectivity and we find minimum vertex cut which disconnects the start and final: (s,f). After that we remove some critical cells.
To turn it to the graph, take dual of maze. Find minimum (s,f) vertex cut on this graph. Then we examine each vertex in this cut. We remove a vertex its deletion increases the length of all s,f paths or if it is in the minimum length path from s to f. After eliminating a vertex, recursively repeat the above process for k time.
But there is an issue with this, this is when we remove a vertex which cuts any path from s to f. To prevent this we can weight cutting node as high as possible, means first compute minimum (s,f) cut, if cut result is just one node, make it weighted and set a high weight like n^3 to that vertex, now again compute the minimum s,f cut, single cutting vertex in previous calculation doesn't belong to new cut because of waiting.
But if there is just one path between s,f (after some iterations) we can't improve it. In this case we can use normal greedy algorithms like removing node from a one of a shortest path from s to f which doesn't belong to any cut. after that we can deal with minimum vertex cut.
The algorithm running time in each step is:
min-cut + path finding for all nodes in min-cut
O(min cut) + O(n^2)*O(number of nodes in min-cut)
And because number of nodes in min cut can not be greater than O(n^2) in very pessimistic situation the algorithm is O(kn^4), but normally it shouldn't take more than O(kn^3), because normally min-cut algorithm dominates path finding, also normally path finding doesn't takes O(n^2).
I guess the greedy choice is a good start point for simulated annealing type algorithms.
P.S: minimum vertex cut is similar to minimum edge cut, and similar approach like max-flow/min-cut can be applied on minimum vertex cut, just assume each vertex as two vertex, one Vi, one Vo, means input and outputs, also converting undirected graph to directed one is not hard.
it can be easily shown (proof let as an exercise to the reader) that it is enough to search for the solution so that every one of the K blockades is put on the current minimum-length route. Note that if there are multiple minimal-length routes then all of them have to be considered. The reason is that if you don't put any of the remaining blockades on the current minimum-length route then it does not change; hence you can put the first available blockade on it immediately during search. This speeds up even a brute-force search.
But there are more optimizations. You can also always decide that you put the next blockade so that it becomes the FIRST blockade on the current minimum-length route, i.e. you work so that if you place the blockade on the 10th square on the route, then you mark the squares 1..9 as "permanently open" until you backtrack. This saves again an exponential number of squares to search for during backtracking search.
You can then apply heuristics to cut down the search space or to reorder it, e.g. first try those blockade placements that increase the length of the current minimum-length route the most. You can then run the backtracking algorithm for a limited amount of real-time and pick the best solution found thus far.
I believe we can reduce the contained maximum manifold problem to boolean satisifiability and show NP-completeness through any dependency on this subproblem. Because of this, the algorithms spinning_plate provided are reasonable as heuristics, precomputing and machine learning is reasonable, and the trick becomes finding the best heuristic solution if we wish to blunder forward here.
Consider a board like the following:
..S........
#.#..#..###
...........
...........
..........F
This has many of the problems that cause greedy and gate-bound solutions to fail. If we look at that second row:
#.#..#..###
Our logic gates are, in 0-based 2D array ordered as [row][column]:
[1][4], [1][5], [1][6], [1][7], [1][8]
We can re-render this as an equation to satisfy the block:
if ([1][9] AND ([1][10] AND [1][11]) AND ([1][12] AND [1][13]):
traversal_cost = INFINITY; longest = False # Infinity does not qualify
Excepting infinity as an unsatisfiable case, we backtrack and rerender this as:
if ([1][14] AND ([1][15] AND [1][16]) AND [1][17]:
traversal_cost = 6; longest = True
And our hidden boolean relationship falls amongst all of these gates. You can also show that geometric proofs can't fractalize recursively, because we can always create a wall that's exactly N-1 width or height long, and this represents a critical part of the solution in all cases (therefore, divide and conquer won't help you).
Furthermore, because perturbations across different rows are significant:
..S........
#.#........
...#..#....
.......#..#
..........F
We can show that, without a complete set of computable geometric identities, the complete search space reduces itself to N-SAT.
By extension, we can also show that this is trivial to verify and non-polynomial to solve as the number of gates approaches infinity. Unsurprisingly, this is why tower defense games remain so fun for humans to play. Obviously, a more rigorous proof is desirable, but this is a skeletal start.
Do note that you can significantly reduce the n term in your n-choose-k relation. Because we can recursively show that each perturbation must lie on the critical path, and because the critical path is always computable in O(V+E) time (with a few optimizations to speed things up for each perturbation), you can significantly reduce your search space at a cost of a breadth-first search for each additional tower added to the board.
Because we may tolerably assume O(n^k) for a deterministic solution, a heuristical approach is reasonable. My advice thus falls somewhere between spinning_plate's answer and Soravux's, with an eye towards machine learning techniques applicable to the problem.
The 0th solution: Use a tolerable but suboptimal AI, in which spinning_plate provided two usable algorithms. Indeed, these approximate how many naive players approach the game, and this should be sufficient for simple play, albeit with a high degree of exploitability.
The 1st-order solution: Use a database. Given the problem formulation, you haven't quite demonstrated the need to compute the optimal solution on the fly. Therefore, if we relax the constraint of approaching a random board with no information, we can simply precompute the optimum for all K tolerable for each board. Obviously, this only works for a small number of boards: with V! potential board states for each configuration, we cannot tolerably precompute all optimums as V becomes very large.
The 2nd-order solution: Use a machine-learning step. Promote each step as you close a gap that results in a very high traversal cost, running until your algorithm converges or no more optimal solution can be found than greedy. A plethora of algorithms are applicable here, so I recommend chasing the classics and the literature for selecting the correct one that works within the constraints of your program.
The best heuristic may be a simple heat map generated by a locally state-aware, recursive depth-first traversal, sorting the results by most to least commonly traversed after the O(V^2) traversal. Proceeding through this output greedily identifies all bottlenecks, and doing so without making pathing impossible is entirely possible (checking this is O(V+E)).
Putting it all together, I'd try an intersection of these approaches, combining the heat map and critical path identities. I'd assume there's enough here to come up with a good, functional geometric proof that satisfies all of the constraints of the problem.
At the risk of stating the obvious, here's one algorithm
1) Find the shortest path
2) Test blocking everything node on that path and see which one results in the longest path
3) Repeat K times
Naively, this will take O(K*(V+ E log E)^2) but you could with some little work improve 2 by only recalculating partial paths.
As you mention, simply trying to break the path is difficult because if most breaks simply add a length of 1 (or 2), its hard to find the choke points that lead to big gains.
If you take the minimum vertex cut between the start and the end, you will find the choke points for the entire graph. One possible algorithm is this
1) Find the shortest path
2) Find the min-cut of the whole graph
3) Find the maximal contiguous node set that intersects one point on the path, block those.
4) Wash, rinse, repeat
3) is the big part and why this algorithm may perform badly, too. You could also try
the smallest node set that connects with other existing blocks.
finding all groupings of contiguous verticies in the vertex cut, testing each of them for the longest path a la the first algorithm
The last one is what might be most promising
If you find a min vertex cut on the whole graph, you're going to find the choke points for the whole graph.
Here is a thought. In your grid, group adjacent walls into islands and treat every island as a graph node. Distance between nodes is the minimal number of walls that is needed to connect them (to block the enemy).
In that case you can start maximizing the path length by blocking the most cheap arcs.
I have no idea if this would work, because you could make new islands using your points. but it could help work out where to put walls.
I suggest using a modified breadth first search with a K-length priority queue tracking the best K paths between each island.
i would, for every island of connected walls, pretend that it is a light. (a special light that can only send out horizontal and vertical rays of light)
Use ray-tracing to see which other islands the light can hit
say Island1 (i1) hits i2,i3,i4,i5 but doesn't hit i6,i7..
then you would have line(i1,i2), line(i1,i3), line(i1,i4) and line(i1,i5)
Mark the distance of all grid points to be infinity. Set the start point as 0.
Now use breadth first search from the start. Every grid point, mark the distance of that grid point to be the minimum distance of its neighbors.
But.. here is the catch..
every time you get to a grid-point that is on a line() between two islands, Instead of recording the distance as the minimum of its neighbors, you need to make it a priority queue of length K. And record the K shortest paths to that line() from any of the other line()s
This priority queque then stays the same until you get to the next line(), where it aggregates all priority ques going into that point.
You haven't showed the need for this algorithm to be realtime, but I may be wrong about this premice. You could then precalculate the block positions.
If you can do this beforehand and then simply make the AI build the maze rock by rock as if it was a kind of tree, you could use genetic algorithms to ease up your need for heuristics. You would need to load any kind of genetic algorithm framework, start with a population of non-movable blocks (your map) and randomly-placed movable blocks (blocks that the AI would place). Then, you evolve the population by making crossovers and transmutations over movable blocks and then evaluate the individuals by giving more reward to the longest path calculated. You would then simply have to write a resource efficient path-calculator without the need of having heuristics in your code. In your last generation of your evolution, you would take the highest-ranking individual, which would be your solution, thus your desired block pattern for this map.
Genetic algorithms are proven to take you, under ideal situation, to a local maxima (or minima) in reasonable time, which may be impossible to reach with analytic solutions on a sufficiently large data set (ie. big enough map in your situation).
You haven't stated the language in which you are going to develop this algorithm, so I can't propose frameworks that may perfectly suit your needs.
Note that if your map is dynamic, meaning that the map may change over tower defense iterations, you may want to avoid this technique since it may be too intensive to re-evolve an entire new population every wave.
I'm not at all an algorithms expert, but looking at the grid makes me wonder if Conway's game of life might somehow be useful for this. With a reasonable initial seed and well-chosen rules about birth and death of towers, you could try many seeds and subsequent generations thereof in a short period of time.
You already have a measure of fitness in the length of the creeps' path, so you could pick the best one accordingly. I don't know how well (if at all) it would approximate the best path, but it would be an interesting thing to use in a solution.
Is there an algorithm that can check, in a directed graph, if a vertex, let's say V2, is reachable from a vertex V1, without traversing all the vertices?
You might find a route to that node without traversing all the edges, and if so you can give a yes answer as soon as you do. Nothing short of traversing all the edges can confirm that the node isn't reachable (unless there's some other constraint you haven't stated that could be used to eliminate the possibility earlier).
Edit: I should add that it depends on how often you need to do queries versus how large (and dense) your graph is. If you need to do a huge number of queries on a relatively small graph, it may make sense to pre-process the data in the graph to produce a matrix with a bit at the intersection of any V1 and V2 to indicate whether there's a connection from V1 to V2. This doesn't avoid traversing the graph, but it can avoid traversing the graph at the time of the query. I.e., it's basically a greedy algorithm that assumes you're going to eventually use enough of the combinations that it's easiest to just traverse them all and store the result. Depending on the size of the graph, the pre-processing step may be slow, but once it's done executing a query becomes quite fast (constant time, and usually a pretty small constant at that).
Depth first search or breadth first search. Stop when you find one. But there's no way to tell there's none without going through every one, no. You can improve the performance sometimes with some heuristics, like if you have additional information about the graph. For example, if the graph represents a coordinate space like a real map, and most of the time you know that there's going to be a mostly direct path, then you can attempt to have the depth-first search look along lines that "aim towards the target". However, imagine the case where the start and end points are right next to each other, but with no vector inbetween, and to find it, you have to go way out of the way. You have to check every case in order to be exhaustive.
I doubt it has a name, but a breadth-first search might go like this:
Add V1 to a queue of nodes to be visited
While there are nodes in the queue:
If the node is V2, return true
Mark the node as visited
For every node at the end of an outgoing edge which is not yet visited:
Add this node to the queue
End for
End while
Return false
Create an adjacency matrix when the graph is created. At the same time you do this, create matrices consisting of the powers of the adjacency matrix up to the number of nodes in the graph. To find if there is a path from node u to node v, check the matrices (starting from M^1 and going to M^n) and examine the value at (u, v) in each matrix. If, for any of the matrices checked, that value is greater than zero, you can stop the check because there is indeed a connection. (This gives you even more information as well: the power tells you the number of steps between nodes, and the value tells you how many paths there are between nodes for that step number.)
(Note that if you know the number of steps in the longest path in your graph, for whatever reason, you only need to create a number of matrices up to that power. As well, if you want to save memory, you could just store the base adjacency matrix and create the others as you go along, but for large matrices that may take a fair amount of time if you aren't using an efficient method of doing the multiplications, whether from a library or written on your own.)
It would probably be easiest to just do a depth- or breadth-first search, though, as others have suggested, not only because they're comparatively easy to implement but also because you can generate the path between nodes as you go along. (Technically you'd be generating multiple paths and discarding loops/dead-end ones along the way, but whatever.)
In principle, you can't determine that a path exists without traversing some part of the graph, because the failure case (a path does not exist) cannot be determined without traversing the entire graph.
You MAY be able to improve your performance by searching backwards (search from destination to starting point), or by alternating between forward and backward search steps.
Any good AI textbook will talk at length about search techniques. Elaine Rich's book was good in this area. Amazon is your FRIEND.
You mentioned here that the graph represents a road network. If the graph is planar, you could use Thorup's Algorithm which creates an O(nlogn) space data structure that takes O(nlogn) time to build and answers queries in O(1) time.
Another approach to this problem would allow you to ignore all of the vertices. If you were to only look at the edges, you can produce a transitive closure array that will show you each vertex that is reachable from any other vertex.
Start with your list of edges:
Va -> Vc
Va -> Vd
....
Create an array with start location as the rows and end location as the columns. Fill the arrays with 0. For each edge in the list of edges, place a one in the start,end coordinate of the edge.
Now you iterate a few times until either V1,V2 is 1 or there are no changes.
For each row:
NextRowN = RowN
For each column that is true for RowN
Use boolean OR to OR in the results of that row of that number with the current NextRowN.
Set RowN to NextRowN
If you run this algorithm until the end, you will quickly have a complete list of all reachable vertices without looking at any of them. The runtime is proportional to the number of edges. This would work well with a reasonable implementation and a reasonable number of edges.
A slightly more complex version of this algorithm would be to only calculate the vertices reachable by V1. To do this, you would focus your scope on the ones that are currently reachable at any given time. You can also limit adding rows to only one time, since the other rows are never changing.
In order to be sure, you either have to find a path, or traverse all vertices that are reachable from V1 once.
I would recommend an implementation of depth first or breadth first search that stops when it encounters a vertex that it has already seen. The vertex will be processed on the first occurrence only. You need to make sure that the search starts at V1 and stops when it runs out of vertices or encounters V2.
The minimum spanning tree problem is to take a connected weighted graph and find the subset of its edges with the lowest total weight while keeping the graph connected (and as a consequence resulting in an acyclic graph).
The algorithm I am considering is:
Find all cycles.
remove the largest edge from each cycle.
The impetus for this version is an environment that is restricted to "rule satisfaction" without any iterative constructs. It might also be applicable to insanely parallel hardware (i.e. a system where you expect to have several times more degrees of parallelism then cycles).
Edits:
The above is done in a stateless manner (all edges that are not the largest edge in any cycle are selected/kept/ignored, all others are removed).
What happens if two cycles overlap? Which one has its longest edge removed first? Does it matter if the longest edge of each is shared between the two cycles or not?
For example:
V = { a, b, c, d }
E = { (a,b,1), (b,c,2), (c,a,4), (b,d,9), (d,a,3) }
There's an a -> b -> c -> a cycle, and an a -> b -> d -> a
#shrughes.blogspot.com:
I don't know about removing all but two - I've been sketching out various runs of the algorithm and assuming that parallel runs may remove an edge more than once I can't find a situation where I'm left without a spanning tree. Whether or not it's minimal I don't know.
For this to work, you'd have to detail how you would want to find all cycles, apparently without any iterative constructs, because that is a non-trivial task. I'm not sure that's possible. If you really want to find a MST algorithm that doesn't use iterative constructs, take a look at Prim's or Kruskal's algorithm and see if you could modify those to suit your needs.
Also, is recursion barred in this theoretical architecture? If so, it might actually be impossible to find a MST on a graph, because you'd have no means whatsoever of inspecting every vertex/edge on the graph.
I dunno if it works, but no matter what your algorithm is not even worth implementing. Finding all cycles will be the freaking huge bottleneck that will kill it. Also doing that without iterations is impossible. Why don't you implement some standard algorithm, let's say Prim's.
Your algorithm isn't quite clearly defined. If you have a complete graph, your algorithm would seem to entail, in the first step, removing all but the two minimum elements. Also, listing all the cycles in a graph can take exponential time.
Elaboration:
In a graph with n nodes and an edge between every pair of nodes, there are, if I have my math right, n!/(2k(n-k)!) cycles of size k, if you're counting a cycle as some subgraph of k nodes and k edges with each node having degree 2.
#Tynan The system can be described (somewhat over simplified) as a systems of rules describing categorizations. "Things are in category A if they are in B but not in C", "Nodes connected to nodes in Z are also in Z", "Every category in M is connected to a node N and has 'child' categories, also in M for every node connected to N". It's slightly more complicated than this. (I have shown that by creating unstable rules you can model a turning machine but that's beside the point.) It can't explicitly define iteration or recursion but can operate on recursive data with rules like the 2nd and 3rd ones.
#Marcin, Assume that there are an unlimited number of processors. It is trivial to show that the program can be run in O(n^2) for n being the longest cycle. With better data structures, this can be reduced to O(n*O(set lookup function)), I can envision hardware (quantum computers?) that can evaluate all cycles in constant time. giving a O(1) solution to the MST problem.
The Reverse-delete algorithm seems to provide a partial proof of correctness (that the proposed algorithm will not produce a non-minimal spanning tree) this is derived by arguing that mt algorithm will remove every edge that the Reverse-delete algorithm will. However I'm not sure how to show that my algorithm won't delete more than that algorithm.
Hhmm....
OK this is an attempt to finish the proof of correctness. By analogy to the Reverse-delete algorithm, we know that enough edges will be removed. What remains is to show that there will not be to many edges removed.
Removing to many edges can be described as removing all the edges between the side of a binary partition of the graph nodes. However only edges in a cycle are ever removed, therefor, for all edge between partitions to be removed, there needs to be a return path to complete the cycle. If we only consider edges between the partitions then the algorithm can at most remove the larger of each pair of edges, this can never remove the smallest bridging edge. Therefor for any arbitrary binary partitioning, the algorithm can't sever all links between the side.
What remains is to show that this extends to >2 way partitions.