Implementing Kruskal's algorithm in Ada, not sure where to start

With reference to Kruskal's algorithm in Ada, I'm not sure where to start.
I'm trying to think through everything before I actually write the program, but am pretty lost as to what data structures I should be using and how to represent everything.
My original thought is to represent the full tree in an adjacency list, but reading Wikipedia the algorithm states to create a forest F (a set of trees), where each vertex in the graph is a separate tree and I'm not sure how to implement this without getting really messy quickly.
The next thing it says to do is create a set S containing all the edges in the graph, but once again I'm not sure what the best way to do this would be. I was thinking of an array of records, with a to, from and weight, but I'm lost on the forest.
Lastly, I'm trying to figure out how I would know if an edge connects two trees, but again am not sure what the best way to do all of this is.

I can see where their algorithm description would leave you confused as to how to start. It left me the same way.
I'd suggest reading over the later Example section instead. That makes it pretty clear how to proceed, and you can probably come up with the data structures you would need to do it just from that.
It looks like the basic idea is the following:
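Take the graph, find the shortest remaining edge that joins two different trees (that is, one that does not create a cycle), and put it in your "spanning tree".
Repeat the step above until every vertex is in one tree, which for a connected graph with n vertices happens after n - 1 edges.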

The "create a forest" part really means: implement the pseudocode from the page Disjoint-set data structure. If you can read C++, then I have a pretty straightforward implementation here. (That implementation works; I've used it to implement Kruskal's algorithm myself :)
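To make the pieces concrete, here is a minimal sketch in Python (not Ada, purely for brevity). It assumes vertices are numbered 0..n-1 and that each edge is a (weight, from, to) tuple, which maps directly onto the array of records mentioned in the question. The names kruskal, find and union are illustrative only and are not taken from the linked implementation.

```python
def kruskal(num_vertices, edges):
    """Return the edges of a minimum spanning forest.

    edges is a list of (weight, u, v) tuples; vertices are 0..num_vertices-1.
    """
    # The "forest": parent[x] == x means x is the root of its own tree.
    parent = list(range(num_vertices))
    rank = [0] * num_vertices

    def find(x):
        # Walk up to the root, shortening the path as we go (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        # Merge the trees containing a and b; return False if already joined.
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        if rank[ra] < rank[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        if rank[ra] == rank[rb]:
            rank[ra] += 1
        return True

    tree = []
    for weight, u, v in sorted(edges):   # the set S, cheapest edge first
        if union(u, v):                  # edge connects two different trees
            tree.append((weight, u, v))
    return tree
```

The "does this edge connect two trees?" test is just find(u) != find(v), so the forest never needs to be stored as actual trees; two integer arrays are enough, which should translate to Ada arrays without much trouble.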

Related

Find all cycles in a directed Graph Golang

I'm trying to generate all the cycles contained in a directed graph using Golang (or at least a few).
I currently have two structs :
Node : { ID (string), resolved (bool), edges ([]Edge) }
Edge : { ID (string), start (Node), end (Node), weight (Float64)}
The cycle weight is not an issue (for the moment).
I've found some answers regarding how to detect cycles, find the shortest path, etc., but I didn't find an algorithm that quite helps me.
How shall I proceed? (any suggestion is welcome)
There are two parts to the question.
Regarding algorithms to detect all cycles in a graph, take a look at this related question (it is not Go-specific). It has useful explanations and pseudo-code that you can use to implement your solution.
Finding all cycles in a directed graph
As for Go-specific code, there are several libraries out there that work with graphs. You can take a look at their documentation and source code (they might even provide functionality that you can use out-of-the-box to solve your problem).
For example: https://godoc.org/github.com/twmb/algoimpl/go/graph
I would suggest starting with defining what a cycle is - for example let's suppose it is a traversal through the graph that starts and ends in the same node.
To enumerate all cycles with this definition, you'll need to consider all paths starting from all nodes, and check if any of those paths go back to their start point.
However, observe that this definition can count each cyclical subgraph many times: every node along a cyclical path is a valid start point, so is that one cycle or several? Things get even more complicated if the paths of several cycles intersect: the number of cyclical paths increases drastically, and it's not very clear which cycles are "the same".
I hope it's easy to see that a brute force approach is intractable for anything but very small and simple graphs, and that something narrower, say minimal cycles or even just identifying cyclic subgraphs, may be enough for your purposes.
As already mentioned by @eugenioy, this has been asked before and you can probably narrow down your question by looking at the answers in that thread.
So, depending on what you mean by "all" and what you mean by "cycles", you can probably find an algorithm that defines cycles the way you are interested in, and then ask a more focused question if you're having trouble translating it to Go, which I don't think your question is really about at the moment.
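One way to pin the definition down is to count only simple cycles (no repeated node) and to report each one once, starting from its smallest node. The sketch below does that with a plain depth-first search. It is written in Python to keep it short, it is not the algorithm from the linked thread, and the function name simple_cycles and the dict-of-lists graph shape are assumptions made for illustration. It is exponential in the worst case and recursive, so treat it as suitable for small graphs only.

```python
def simple_cycles(graph):
    """Return every simple cycle of a directed graph exactly once.

    graph maps each node to a list of successor nodes. Each cycle is
    returned as a list of nodes beginning with its smallest node.
    """
    nodes = sorted(set(graph) | {m for succs in graph.values() for m in succs})
    order = {n: i for i, n in enumerate(nodes)}
    cycles = []

    for start in nodes:
        path = [start]
        on_path = {start}

        def dfs(node):
            for nxt in graph.get(node, ()):
                if nxt == start:
                    # Back at the start: the current path is a cycle.
                    cycles.append(list(path))
                elif order[nxt] > order[start] and nxt not in on_path:
                    # Only walk through nodes larger than start, so a cycle
                    # is never rediscovered from one of its other nodes.
                    path.append(nxt)
                    on_path.add(nxt)
                    dfs(nxt)
                    on_path.remove(nxt)
                    path.pop()

        dfs(start)
    return cycles
```

With parallel edges between the same pair of nodes, the same node sequence would be reported more than once, so deduplicate edges first if that matters.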

What data structure to use for digraph paths?

I'm trying to represent a transitive relation (in a database) and having a hard time working out the best data structure.
Basically, the data structure is a series of pairs A → B such that if A → B and B → C, then implicitly A → C. It's important to me to be able to identify which entries are original input and which entries exist implicitly. Asking if A → C is equivalent to me having a digraph and asking if there exists a path from A to C in that digraph.
I could just represent the original entries, but if I do that, then it takes a lot of time to determine if two items are related, since I need to search for all possible paths and this is rather slow.
Alternatively, I can store the original edges, as well as a listing of all paths. This makes adding a new edge easy, because when I add A → B I can just take the Cartesian product of the paths ending in A and the paths starting at B and put them together. This has some significant space overhead of O(n^2) in the worst case, but has the nice property that lookups, by far the most common operation, will be constant time. The issue is deleting, where I cannot think of anything other than recalculating all paths that may or may not run through the deleted edge, and this can be really nasty.
Does anyone have any better ideas?
Technical notes: the digraph may be cyclic, but the relation is reflexive so I don't need to represent the reflexivity or store anything about it.
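For reference, the second scheme described in the question can be sketched as follows, in Python and in memory rather than in a database. The class name Reachability and its methods are hypothetical. Deletion is handled the naive way, by rebuilding everything from the original edges, which is exactly the expensive part the question is asking how to avoid.

```python
from collections import defaultdict


class Reachability:
    def __init__(self):
        self.edges = set()            # original input pairs only
        self.desc = defaultdict(set)  # node -> nodes reachable from it
        self.anc = defaultdict(set)   # node -> nodes that can reach it

    def add_edge(self, a, b):
        self.edges.add((a, b))
        # Everything that reaches a (plus a itself) now reaches
        # everything reachable from b (plus b itself).
        sources = self.anc[a] | {a}
        targets = self.desc[b] | {b}
        for s in sources:
            for t in targets:
                if s != t:            # reflexive pairs are left implicit
                    self.desc[s].add(t)
                    self.anc[t].add(s)

    def remove_edge(self, a, b):
        # Naive deletion: throw the closure away and rebuild it.
        remaining = self.edges - {(a, b)}
        self.__init__()
        for x, y in remaining:
            self.add_edge(x, y)

    def reachable(self, a, b):
        return a == b or b in self.desc.get(a, ())

    def is_original(self, a, b):
        return (a, b) in self.edges
```

Lookups are a set membership test, insertion costs the size of the Cartesian product, and keeping edges separate from desc is what distinguishes original entries from implied ones.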
This is called the Reachability problem.
It would seem that you want an efficient online algorithm, which is an open problem, and an area of much research.
See my similar question on cs.SE: An incrementally-condensed transitive-reduction of a DAG, with efficient reachability queries, where I reference several related questions across Stack Exchange:
Related:
What is the fastest deterministic algorithm for dynamic digraph reachability with no edge deletion?
What is the fastest deterministic algorithm for incremental DAG reachability?
Does an algorithm exist to efficiently maintain connectedness information for a DAG in presence of inserts/deletes?
Is there an online-algorithm to keep track of components in a changing undirected graph?
Dynamic shortest path data structure for DAG
Note that even though some algorithm might be for a DAG only, if it supports condensation (that is, collapsing each strongly connected component into one node, since its members all relate back and forth and are therefore considered equal), it is equivalent. After condensation, you can query the graph for the representative node in place of any of the condensed nodes, because the condensed nodes were all reachable from each other and thus related to the rest of the graph in exactly the same way.
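As an illustration of the condensation step, here is a small static sketch in Python using Kosaraju's two-pass algorithm. It is not dynamic and is not taken from any of the linked questions; the function name condense and the graph shape (a dict in which every node is a key) are assumptions. It is recursive, so it only suits graphs of modest depth.

```python
def condense(graph):
    """Collapse strongly connected components.

    graph maps every node to a list of successors. Returns (rep, dag):
    rep maps each node to its component's representative, and dag maps
    each representative to the set of representatives it points at.
    """
    # Pass 1: record nodes in order of DFS completion.
    seen, finished = set(), []

    def visit(n):
        seen.add(n)
        for m in graph[n]:
            if m not in seen:
                visit(m)
        finished.append(n)

    for n in graph:
        if n not in seen:
            visit(n)

    # Pass 2: walk the reversed graph in reverse completion order.
    reverse = {n: [] for n in graph}
    for n, succs in graph.items():
        for m in succs:
            reverse[m].append(n)

    rep = {}

    def assign(n, root):
        rep[n] = root
        for m in reverse[n]:
            if m not in rep:
                assign(m, root)

    for n in reversed(finished):
        if n not in rep:
            assign(n, n)

    # Build the acyclic graph between representatives.
    dag = {r: set() for r in set(rep.values())}
    for n, succs in graph.items():
        for m in succs:
            if rep[n] != rep[m]:
                dag[rep[n]].add(rep[m])
    return rep, dag
```

A query "is x related to y" then becomes "is rep[y] reachable from rep[x] in dag", with rep[x] == rep[y] meaning the two relate in both directions.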
My conclusion is that, as yet, there does not seem to be an efficient way to do this (on the order of O(log n) queries for a dynamic graph, with output-sensitive update times on the condensed graph). For less efficient ways, see the related links above.
The closest practical algorithm I found was here (source), which is an interesting read. I am not sure how easy or practical it would be to adapt this data structure, or any data structure from the papers you will find, to a database.
PS. Consider asking CS-related questions on cs.stackexchange.com in the future.

Fastest path to walk over all given nodes

I'm coding a simple game and currently doing the AI part. NPC gets a list of his 'interest points' which he needs to visit. Each point has a coordinate on the map. I need to find a fastest path for the character to visit all of the given points.
As far as I understand it, the task could be described as 'finding fastest traverse path in a strongly connected weighted undirected graph'.
I'd like to get either the name of some algorithm to calculate that or if there is no name - some keypoints on programming it myself.
Thanks in advance.
This is very similar to the Travelling Salesman problem, although I'm not going to try to prove equivalency offhand. The TSP is NP-hard (its decision version is NP-complete), which means that solving the problem exactly may be impractical, depending on the number of interest points. There are approximation algorithms that you may find more useful.
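As one example of a cheap heuristic, the nearest-neighbour rule always walks to the closest point not yet visited. The sketch below is in Python and assumes points are (x, y) tuples and that straight-line distance is a fair stand-in for walking cost; the function name is made up for illustration. It gives no optimality guarantee, but it is simple and fast enough to run per NPC.

```python
import math


def nearest_neighbour_route(start, points):
    """Return points ordered by repeatedly visiting the closest unvisited one."""
    remaining = list(points)
    route = []
    current = start
    while remaining:
        # Pick the unvisited point closest to where the NPC currently stands.
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    return route
```

If the map has obstacles, the distance function would need to be replaced by real path lengths (for instance from the game's pathfinder), which is an extension this sketch does not cover.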
See previous post regarding tree traversals:
Tree traversal algorithm for directory structures with a lot of files
I would use something like an ant colony optimization algorithm.
Not directly on point but what I did in an MMO emulator was to store waypoint indices along with the rest of the pathing data. If your requirement is to demonstrate solutions to TSP then ignore this. If not, it's worth consideration IMO.
In my case it was the best solution as otherwise the server could have potentially hundreds of mobs (re)spawning and along with all the other AI logic, would have to burn cycles computing route logic.

Algorithm to compute the optimal layout of n-ary tree?

I am looking for an algorithm that will automatically arrange all the nodes in an n-ary tree so that no nodes overlap and not too much space is wasted. The user will be able to add nodes at runtime and the tree must auto-arrange itself. Also note that the trees could get fairly large (a few thousand nodes).
The algorithm has to work in real time, meaning the user cannot notice any pausing.
I have tried Google but I haven't found any substantial resources, any help is appreciated!
I took a look at this problem a while back and ultimately decided to change my goals from a directed acyclic graph (DAG) to a general graph, purely because of the complexities I encountered.
That being said, have you looked at the Sugiyama algorithm for graph layout?
If you're not looking to roll your own, I came across yFiles that did the job quite nicely (a bit on the pricy side though, so I did end up doing exactly that - rolling my own).
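If you do roll your own for a plain tree, a much simpler technique than Sugiyama is often enough: give each leaf the next free horizontal slot and centre every parent over its children. The Python sketch below shows that idea. It is not the Sugiyama algorithm and not what yFiles does, and the function name and the dict-of-children input are assumptions. It guarantees no overlaps but does not try to minimise width.

```python
def layout(tree, root):
    """Assign (x, depth) positions to every node of a tree.

    tree maps a node to the list of its children; leaves may be omitted.
    """
    positions = {}
    next_slot = 0

    def place(node, depth):
        nonlocal next_slot
        children = tree.get(node, [])
        if not children:
            # Leaves take consecutive slots from left to right.
            x = next_slot
            next_slot += 1
        else:
            # A parent sits midway between its first and last child.
            xs = [place(child, depth + 1) for child in children]
            x = (xs[0] + xs[-1]) / 2
        positions[node] = (x, depth)
        return x

    place(root, 0)
    return positions
```

Each call touches every node once, so re-running it after an insertion on a tree of a few thousand nodes should be quick, though very deep trees would need an iterative version to avoid the recursion limit.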

Efficient way to recursively calculate dominator tree?

I'm using the Lengauer and Tarjan algorithm with path compression to calculate the dominator tree for a graph where there are millions of nodes. The algorithm is quite complex and I have to admit I haven't taken the time to fully understand it, I'm just using it. Now I have a need to calculate the dominator trees of the direct children of the root node and possibly recurse down the graph to a certain depth repeating this operation. I.e. when I calculate the dominator tree for a child of the root node I want to pretend that the root node has been removed from the graph.
My question is whether there is an efficient solution to this that makes use of immediate dominator information already calculated in the initial dominator tree for the root node? In other words I don't want to start from scratch for each of the children because the whole process is quite time consuming.
Naively it seems it must be possible since there will be plenty of nodes deep down in the graph that have idoms just a little way above them and are unaffected by changes at the top of the graph.
BTW just as aside: it's bizarre that the subject of dominator trees is "owned" by compiler people and there is no mention of it in books on classic graph theory. The application I'm using it for - my FindRoots java heap analyzer - is not related to compiler theory.
Clarification: I'm talking about directed graphs here. The "root" I refer to is actually the node with the greatest reachability. I've updated the text above replacing references to "tree" with "graph". I tend to think of them as trees because the shape is mainly tree-like. The graph is actually of the objects in a java heap and as you can imagine is reasonably hierarchical. I have found the dominator tree useful when doing OOM leak analysis because what you are interested in is "what keeps this object alive?" and the answer ultimately is its dominator. Dominator trees allow you to <ahem> see the wood rather than the trees. But sometimes lots of junk floats to the top of the tree so you have a root with thousands of children directly below it. For such cases I would like to experiment with calculating the dominator trees rooted at each of the direct children (in the original graph) of the root and then maybe go to the next level down and so on. (I'm trying not to worry about the possibility of back links for the time being :)
boost::lengauer_tarjan_dominator_tree_without_dfs might help.
Judging by the lack of comments, I guess there aren't many people on Stack Overflow with the relevant experience to help you. I don't have that experience either, but I don't want such an interesting question to go down with a dull thud, so I'll try to lend a hand.
My first thought is that if this graph is generated by other compilers would it be worth taking a look at an open-source compiler, like GCC, to see how it solves this problem?
My second thought is that, the main point of your question appears to be avoiding recomputing the result for the root of the tree.
What I would do is create a wrapper around each node that contains the node itself and any pre-computed data associated with that node. A new tree would then be reconstructed from the old tree recursively using these wrapper classes. As you're constructing this tree, you'd start at the root and work your way out to the leaf nodes. For each node, you'd store the result of the computation for all of its ancestors thus far. That way, you should only ever have to look at the parent node and the data of the node you're currently processing to compute the value for your new node.
I hope that helps!
Could you elaborate on what sort of graph you're starting with? I don't see how there is any difference between a graph which is a tree, and the dominator tree of that graph. Every node's parent should be its idom, and it would of course be dominated by everything above it in the tree.
I do not fully understand your question, but it seems to me you want some kind of incremental update feature. I researched a while ago what algorithms there are, but it seemed to me that there's no known way to do this quickly for large graphs (at least from a theoretical standpoint).
You may just search for "incremental updates dominator tree" to find some references.
I guess you are aware the Eclipse Memory Analyzer does use dominator trees, so this topic is not completely "owned" by the compiler community anymore :)
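For anyone following the thread who has not met the structure before, here is a small Python sketch that computes immediate dominators with the simple iterative scheme described by Cooper, Harvey and Kennedy. It is not Lengauer-Tarjan and not an incremental algorithm, so it does not answer the reuse question; it only shows what is being computed. The function name and the dict-of-successors graph shape are assumptions, and the recursive DFS makes it unsuitable for graphs with millions of nodes.

```python
def immediate_dominators(graph, root):
    """Return a dict mapping each node reachable from root to its idom.

    graph maps a node to a list of successors. The root maps to itself.
    """
    # Number the reachable nodes in reverse postorder.
    seen, postorder = set(), []

    def dfs(n):
        seen.add(n)
        for m in graph.get(n, ()):
            if m not in seen:
                dfs(m)
        postorder.append(n)

    dfs(root)
    order = postorder[::-1]
    index = {n: i for i, n in enumerate(order)}

    preds = {n: [] for n in order}
    for n in order:
        for m in graph.get(n, ()):
            preds[m].append(n)

    idom = {root: root}

    def intersect(a, b):
        # Climb both candidates up the current tree until they meet.
        while a != b:
            while index[a] > index[b]:
                a = idom[a]
            while index[b] > index[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for n in order[1:]:
            # Only predecessors that already have an idom can take part.
            done = [p for p in preds[n] if p in idom]
            new = done[0]
            for p in done[1:]:
                new = intersect(new, p)
            if idom.get(n) != new:
                idom[n] = new
                changed = True
    return idom
```

Calling it with one of the root's children as root gives the dominator tree of the part of the graph reachable from that child, recomputed from scratch each time.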
