I am looking for the computer science and mathematical definitions of a category hierarchy.
As far as I could find, from a computer science point of view, a category hierarchy is a tree data structure in which there is a hierarchical order between a parent and its children. Is this definition exact and complete?
Also, from a mathematical (graph theory) point of view, is a category hierarchy a rooted tree, and more precisely an out-tree? Is there a more precise and complete mathematical definition?
There are many problems in which we need to find the parents or ancestors of a node in a tree repeatedly. In those scenarios, instead of finding the parent node at run-time, a less complicated approach seems to be using parent pointers. This is time efficient but increases space usage. Can anyone suggest in which kinds of problems or scenarios it is advisable to use parent pointers in a tree?
For example, finding the distance between two nodes of a tree?
using parent pointers. This is time efficient but increase space.
A classic trade-off in Computer Science.
In which kind of problems or scenarios, it is advisable to use parent pointers in a tree?
In cases where finding the parents in runtime would cost much more than having pointers to the parents.
Now, one has to understand what cost means. You mentioned the trade-off yourself: one should think about whether it is worth spending some extra memory to store the pointers in order to speed up your program.
Here are some of the scenarios I can think of where storing a parent pointer in each node could help improve our time complexity:
-> Ancestors of a given node in a binary tree
-> Union Find Algorithm
-> Maintain collection of disjoint sets
-> Merge two sets together
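To illustrate the union-find item above, here is a minimal sketch of a disjoint-set structure in which every element stores only a parent index. The class name `DisjointSets` and its method names are my own choices for illustration; union by size and path compression are standard optimizations that I have added, they are not something the list above prescribes.

```python
class DisjointSets:
    """Disjoint sets over the integers 0..n-1, stored purely as parent pointers."""

    def __init__(self, n):
        # Every element starts as the root of its own one-element set.
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        """Return the representative (root) of the set containing x."""
        # First pass: follow parent pointers up to the root.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Second pass (path compression): point every visited node at the root.
        while self.parent[x] != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return False
        # Union by size: hang the smaller tree under the larger one.
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a
        self.size[root_a] += self.size[root_b]
        return True
```

Note that the structure never stores child pointers at all: every operation only walks upward, which is why the parent pointer is the natural representation here.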
In my opinion, having a parent pointer generally makes traversal easier for any kind of tree or trie problem, whether you are walking top-down or bottom-up.
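As a rough sketch of the "ancestors" scenario and of the distance question from the original post, the following assumes a node type that stores nothing but a name and a parent pointer. `Node`, `ancestors`, and `distance` are hypothetical names chosen for this example.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # None for the root


def ancestors(node):
    """Return the ancestors of node, nearest first, by following parent pointers."""
    result = []
    current = node.parent
    while current is not None:
        result.append(current)
        current = current.parent
    return result


def distance(a, b):
    """Number of edges on the path between a and b in the same tree."""
    # Record how many steps it takes to reach each ancestor-or-self of a.
    steps_from_a = {}
    current, steps = a, 0
    while current is not None:
        steps_from_a[current] = steps
        current = current.parent
        steps += 1
    # Walk up from b until we hit a node already seen from a:
    # that node is the lowest common ancestor.
    current, steps = b, 0
    while current is not None:
        if current in steps_from_a:
            return steps_from_a[current] + steps
        current = current.parent
        steps += 1
    raise ValueError("nodes are not in the same tree")
```

Both functions take time proportional to the depth of the nodes involved and never need access to the root, which is the whole point of paying for the extra pointer.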
Hope this helps!
As a generalized answer: cases where you need efficient bottom-up traversal outside the context of a top-to-bottom traversal.
As a concrete example, let's say you have graphics software which uses a quad-tree to efficiently draw only the elements on screen, and to let users efficiently select the elements they click on or marquee select.
However, after the users select some elements, they can then delete them. Deleting those elements would require the quad-tree to be updated in a bottom-up sort of fashion, updating parent nodes in response to leaf nodes becoming empty. But the elements we want to delete are stored in a different selection list data structure. We didn't arrive at the elements to delete through a top-to-bottom tree traversal.
In that case it might not only be a lot simpler to implement but also computationally efficient to store pointers/indices from child to parent, and possibly even element to leaf, since we're updating the tree in response to activity that occurred at the leaves in a bottom-up fashion. Otherwise you'd have to work from top to bottom and then back up again somehow, and the removal of such elements would have to be done centrally through the tree working in a top-to-bottom-and-back-up-again fashion.
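Here is a small sketch of that bottom-up update. It is not a real quad-tree: a generic tree with an arbitrary number of children stands in for it, and `TreeNode`, `Element`, and `remove_element` are hypothetical names. The assumption is that each element keeps a pointer to its leaf and each node keeps a pointer to its parent.

```python
class TreeNode:
    def __init__(self, parent=None):
        self.parent = parent      # child-to-parent pointer
        self.children = []
        self.elements = set()     # only leaves hold elements in this sketch
        if parent is not None:
            parent.children.append(self)

    def is_empty(self):
        return not self.children and not self.elements


class Element:
    def __init__(self, name, leaf):
        self.name = name
        self.leaf = leaf          # element-to-leaf pointer
        leaf.elements.add(self)


def remove_element(element):
    """Remove element and prune nodes that became empty, walking upward."""
    node = element.leaf
    node.elements.discard(element)
    element.leaf = None
    # Climb via parent pointers, detaching each empty node from its parent.
    # The root is kept even when it ends up empty.
    while node.parent is not None and node.is_empty():
        parent = node.parent
        parent.children.remove(node)
        node.parent = None
        node = parent
```

A selection list can then simply call `remove_element` on each selected element, without ever descending from the root.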
The most useful cases I've found are those where the tree needs to update as a result of activity occurring in the leaves from the "outside world", so to speak, not in the middle of descending down the tree, and often involving two or more data structures, not just that one tree itself.
Another example: say you have a GUI widget which, upon being clicked, minimizes its parent widget. But we don't descend down the tree to determine which widget was clicked. We use another data structure for that, like a spatial hash. So in that case we want to get from child widget to parent widget, but we didn't arrive at the child widget through a top-down traversal of the GUI hierarchy, so we don't have the parent widget readily available in, e.g., a stack. We arrived at the clicked child widget through a spatial query into a different data structure. In that case we could avoid working our way down from the root to the child's parent if the child simply stored its parent.
Minimax is often illustrated with a tree, but I know that it can be implemented without the tree. However, I cannot figure out how to do it without the tree. Can you clarify it for me?
Minimax by definition always works like a tree, no matter how you implement it. How you visualise it is another story.
Usually, Minimax is implemented recursively (which can be best visualised using a tree) or iteratively, which still goes through the nodes of a minimax tree, just with another approach.
As pointed out in the first comment, minimax is formally defined on a tree structure, but for many practical applications it's not necessary to formally compute over the entire tree, and the game tree structure does not even need to be known beforehand: if the possible next moves and the termination (game over) states are known, the tree can be built as the algorithm runs. For non-reversible games (like tic-tac-toe), duplicate states at different points of the tree have the same partial subtrees; hence the only thing that needs to be learned is the value of each state, calculated by minimax. These values can also be cached for reuse during the algorithm.
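To make that concrete, here is a minimal sketch of minimax for tic-tac-toe that never builds a tree object. The board is a 9-character string, successor states are generated on the fly, and `functools.lru_cache` caches the value of each state. The representation and the names `winner` and `minimax` are my own assumptions for illustration.

```python
from functools import lru_cache

# The eight winning lines of a 3x3 board, as index triples.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Value of the state for X: +1 if X wins, -1 if O wins, 0 for a draw.

    board is a string of 9 characters ("X", "O" or " "); player moves next.
    """
    won = winner(board)
    if won == "X":
        return 1
    if won == "O":
        return -1
    if " " not in board:
        return 0
    next_player = "O" if player == "X" else "X"
    # Successor states are created here, on demand, and then discarded.
    values = [minimax(board[:i] + player + board[i + 1:], next_player)
              for i, cell in enumerate(board) if cell == " "]
    return max(values) if player == "X" else min(values)
```

The recursion still visits the nodes of the game tree implicitly, but the only thing ever stored is the cached value per state. Calling `minimax(" " * 9, "X")` evaluates the empty board.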
By the way, one interesting and popular application of this 'non-explicit tree structure' use of minimax is Generative Adversarial Networks:
From the abstract:
..a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game.
I'm new to data structures, and had a question on terminology. Is there a term for non-tree like graphs?
I realize that bidirectional/undirected graphs are inherently non-tree like. Is that the appropriate term? I'm asking because it seems that the tree is such a common subcategory of a graph that I figured there might be a term denoting all graphs that fall outside the subcategory.
P.S.: Please feel free to correct any of the vernacular above. I would love tips on appropriate terminology concerning data structures in general.
I don't think there is a single universal term for a non-tree graph (except perhaps "non-tree graph" itself).
Trees are connected, acyclic, directed graphs, with some additional rules like each node (except the root) having exactly one parent. Some kinds of trees have other additional rules that are not common among other kinds of graphs (such as there being a significance to the order of a node's children). Depending on which of those limitations a non-tree graph violates, you might describe it differently.
A tree-like graph that is not fully connected can be described as a "forest". A forest has several root nodes, each anchoring a disjoint subtree.
If you have a graph with multiple root nodes, but their descendants overlap (so that a given child node may have more than one parent node), you have a "multitree". A human family tree may be a multitree if there are no marriages between cousins or other relatives.
The next more general term is probably a "directed acyclic graph" or "DAG". A DAG is more general than a multitree because an ancestor node may be connected to a descendant node by more than one path. Human genealogical trees are more properly thought of as DAGs, since sufficiently distant relatives are generally allowed to get married and have children (but nobody can be their own ancestor). There are many algorithms designed to work on DAGs, as forbidding cycles allows better performance for many useful applications (such as path finding).
More general still is a "directed graph" or "digraph", which relaxes the restriction on cycles. A common digraph data structure is an adjacency list (for each node, a list of the nodes its outgoing arcs point to).
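As a rough illustration of how these categories nest, the sketch below stores a digraph as an adjacency list (a dict mapping each node to a list of its successors) and reports the most specific of the terms above that applies. The function name `classify` and the returned labels are hypothetical. It does not separate multitrees from other DAGs, since that would additionally require counting paths between node pairs. Cycle detection uses Kahn's topological-sort algorithm.

```python
from collections import deque


def classify(graph):
    """Classify a digraph given as {node: [successor, ...]}."""
    # Collect every node, including those that only appear as successors.
    nodes = set(graph)
    for successors in graph.values():
        nodes.update(successors)

    # In-degree = number of parents of each node.
    indegree = {node: 0 for node in nodes}
    for successors in graph.values():
        for child in successors:
            indegree[child] += 1

    # Kahn's algorithm: repeatedly remove nodes with no remaining parents.
    remaining = dict(indegree)
    queue = deque(node for node in nodes if remaining[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in graph.get(node, []):
            remaining[child] -= 1
            if remaining[child] == 0:
                queue.append(child)
    if removed != len(nodes):
        return "digraph with cycles"  # some nodes sit on a cycle

    if any(degree > 1 for degree in indegree.values()):
        return "DAG"                  # acyclic, but some node has several parents
    roots = sum(1 for degree in indegree.values() if degree == 0)
    return "tree" if roots == 1 else "forest"
```

For example, `classify({"a": ["b", "c"], "b": ["d"], "c": ["d"]})` falls into the DAG case because `d` has two parents.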
I don't think there's any more general term beyond that, other than just "graph". If you have a specific application for a graph, there might be a specialized term for the kind of graph you will use (and perhaps algorithms or even library code to go along with it), but you'd need to ask about that specifically.
I'm tasked with creating a family tree drawing application. I ported the code for n-ary tree drawing from http://billmill.org/pymag-trees/. It's fine for drawing a tree where every node has only one parent, but I need a way to let nodes have two parents, and also to allow parents to have other spouses and children with those spouses. I can't find any algorithms for family trees that also include spouses.
I'm programming this in Objective-C, and I need it to look like a top-down tree, not a force-directed graph or circular tree.
I have to implement Fortune's algorithm for constructing Voronoi diagrams.
An important part of the algorithm is a data structure called the "Beach Line Data Structure".
It is a balanced binary tree, similar to an AVL tree, but different in that data is stored only in the leaves (there are other differences, but they are unimportant for the question).
I am not sure how to implement it. Obviously, using an AVL tree "as is" will not work, because when balancing an AVL tree a leaf node can become an inner node and vice versa.
I also looked at some other known data structures on Wikipedia, but none suits my needs.
I have seen some implementations that do this with a linked list, but this is not good because searching a linked list is O(n), and it needs to be O(log n) for the algorithm to be efficient.
The leaves indeed store (single) points, and the inner nodes of the event structure (the "beach line tree") store ordered tuples of points whose parabolas/arcs lie next to each other. If the parabola that point Pa forms lies to the left of the parabola formed by Pb (and these two parabolas intersect), the inner node stores the ordered tuple (Pa, Pb).
Obviously using AVL "as is" will not work because when balancing AVL tree leaf node can become inner node and vice versa.
If you're worried about storing different types of objects in the AVL tree, a simple scheme would be to store the leaves as tuples too. So don't store point Pj as a leaf, but store the tuple (Pj, Pj) instead. If Pj as a leaf disappears from the event tree (beach line), and its parent is (Pi, Pj), simply change the parent into (Pj, Pj), and of course its parent will also need to be changed from (Pj, P?) to (Pi, P?), etc. Just as with a regular AVL tree: you walk up the tree and modify the inner nodes that need to be changed and/or re-balanced.
Note that a good implementation of the algorithm can't be easily written down in an SO answer (at least, not by me!). For a proper explanation of the entire algorithm, including a description of the data structures used by it, see Computational Geometry: Algorithms and Applications by Mark de Berg et al. Chapter 7 is devoted solely to Voronoi diagrams.