Apologies if this is the wrong place to ask this kind of question, but seeing as it is algorithm-based, I felt it fit.
What I am trying to do is find out the name of an algorithm I can apply to the following...
Currently, this is what happens.
I have a node tree that can have any number of starting nodes. (Not exactly a node tree, but the best analogy I could come up with).
These nodes then branch off into any number of other nodes.
Resources are added to these nodes and associate themselves with all child nodes recursively.
Now the problem arises when I want to get all associated nodes for a particular resource. Getting all of them is fine, but not desirable. What I really want to do is only retrieve the top-most node for each association.
Edit
This is being done in JS using Sails as a framework with MySQL datasource.
{
    name: 'Some Node Name',
    children: [], // Array of child nodes
    parent: 1, // Id of the parent node, or null if it is top-level
    resources: [] // Array of resources associated
}
If there is an algorithm that already tackles this, then I'd appreciate being pointed in its direction.
Thanks.
If I understood the question correctly, the nodes you term top-level are those nodes which are not referenced by any other node; in terms of graph theory, these are the ones with indegree zero. From the description of the problem, these are the ones where parent is null.
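Building on that, here is one rough sketch of the filtering step (plain TypeScript, not Sails specific; the ResourceNode shape and the use of resource ids are assumptions based on the node object shown in the question). Among the nodes associated with a resource, it keeps only those whose parent is null or whose parent is not itself associated, so each association chain contributes just its top-most node.

interface ResourceNode {
    id: number;
    name: string;
    parent: number | null; // Id of the parent node, or null if it is top-level
    resources: number[];   // Ids of the resources associated with this node
}

// Return only the top-most associated node of each association chain.
function topMostAssociated(nodes: ResourceNode[], resourceId: number): ResourceNode[] {
    const associated = nodes.filter(n => n.resources.includes(resourceId));
    const associatedIds = new Set(associated.map(n => n.id));
    return associated.filter(n => n.parent === null || !associatedIds.has(n.parent));
}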
Related
I'm learning about graphs and DFS, and I'm trying to do something similar to how ANT resolves dependencies. I'm confused about something, and all the articles I read seem to assume everyone already knows this.
I'm thinking of having a Map<FileNode, Set<FileNode>> with key = file and value = set of files that the key depends on.
The DFS algorithm shows that I have to change the color of a node if it's already visited; that means the reference to the same fileNode must be the same between the one used as a key and the one inside a Set<>, right?
Therefore, I'm thinking that each time a Node is created (including neighbor nodes), I would add it to one more Collection (maybe another Map?); then whenever a new Node is to be added to the graph (as a key), I would search that Collection and use that reference instead. Am I wasting too much space? How is this usually done? Is there some better way?
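One common way to do what that last paragraph describes is to keep a single registry of canonical node objects and always look nodes up there before creating new ones. A minimal sketch (TypeScript here rather than Java; FileNode and getOrCreateNode are illustrative names, not from any article):

interface FileNode {
    file: string;
    dependsOn: Set<FileNode>; // files this file depends on
    seen: boolean;            // visited flag for DFS
}

// Registry of canonical nodes, keyed by file path.
const nodesByFile = new Map<string, FileNode>();

// Always go through this function, so the node used as a map key and the node
// stored inside a Set<> are the exact same object reference.
function getOrCreateNode(file: string): FileNode {
    let node = nodesByFile.get(file);
    if (!node) {
        node = { file, dependsOn: new Set(), seen: false };
        nodesByFile.set(file, node);
    }
    return node;
}

The extra map costs one entry per file, so the space overhead stays linear in the number of nodes.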
During my studies the DFS algorithm was implemented like this:
Push the starting node onto a stack (this is a structure where you can only retrieve and delete the most recently added element).
Pop the top element and set it to seen; this can be done either through coloring or by setting an attribute, let's call it isSeen, to true.
You then look at all the neighbors of that node, and if they are not seen already, you push them onto the stack.
Once you have looked at all the neighbors, you pop the next element of the stack and do the same as for the first, repeating until the stack is empty.
The result is that all the nodes that can be reached from the starting node will have their attribute set to seen.
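Put together, a minimal sketch of that iterative DFS (TypeScript; the GraphNode shape and the isSeen field are just illustrative):

interface GraphNode {
    name: string;
    neighbors: GraphNode[];
    isSeen?: boolean;
}

function dfs(start: GraphNode): void {
    const stack: GraphNode[] = [start]; // push the starting node
    while (stack.length > 0) {
        const node = stack.pop()!;      // retrieve the most recently added element
        if (node.isSeen) continue;      // a node may have been pushed more than once
        node.isSeen = true;             // mark it as seen
        for (const neighbor of node.neighbors) {
            if (!neighbor.isSeen) stack.push(neighbor);
        }
    }
}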
Hope this helped.
I have a DAG implementation that works perfectly for my needs. I'm using it as an internal structure for one of my projects. Recently, I came across a use case where if I modify the attribute of a node, I need to propagate that attribute up to its parents and all the way up to the root. Each node in my DAG currently has an adjacency list that is basically just a list of references to the node's children. However, if I need to propagate changes to the parents of this node (and this node can have multiple parents), I will need a list of references to parent nodes.
Is this acceptable? Or is there a better way of doing this? Does it make sense to maintain two lists (one for parents and one for children)? I thought of adding the parents to the same adjacency list but this will give me cycles (i.e., parent->child and child->parent) for every parent-child relationship.
It's never necessary to store parent pointers in each node, but doing so can make things run a lot faster because you know exactly where to look in order to find the parents. In your case it's perfectly reasonable.
As an analogy - many implementations of binary search trees will store parent pointers so that they can more easily support rotations (which need access to the parent) or deletions (where the parent node may need to be known). Similarly, some more complex data structures like Fibonacci heaps use parent pointers in each node in order to more efficiently implement the decrease-key operation.
The memory overhead for storing a list of parents isn't going to be too bad - you're essentially now double-counting each edge: each parent stores a pointer to its child and each child stores a pointer to its parent.
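A minimal sketch of such a node with both lists, and of the upward propagation described in the question (TypeScript; DagNode and propagateUp are illustrative names, and the visited set guards against revisiting nodes reachable through several parents):

interface DagNode {
    name: string;
    attribute?: string;
    children: DagNode[]; // the existing adjacency list
    parents: DagNode[];  // the extra list of parent references
}

// Propagate an attribute from a node up through all of its ancestors.
function propagateUp(node: DagNode, attribute: string, visited = new Set<DagNode>()): void {
    if (visited.has(node)) return; // already handled via another parent
    visited.add(node);
    node.attribute = attribute;
    for (const parent of node.parents) {
        propagateUp(parent, attribute, visited);
    }
}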
Hope this helps!
Is it 'traditional' (or 'ethical') for a Node in a binary tree to keep a reference to its parent?
Normally, I would not think so, simply because a tree is a directed graph, and so the fact that the PARENT --> CHILD link is defined should not mean that CHILD --> PARENT is also defined.
In other words, by keeping a reference to the parent, we would somehow break the semantics of the tree.
But I would like to know what people think.
I asked because I was given a problem of finding the lowest common parent of two given nodes in a tree. If each node has a reference to its parent, the problem would be super easy to solve, but that feels like cheating!
Thanks
How you implement a binary tree should be dependent on your needs.
If your application requires tree traversal in the direction of leaf to trunk, then the best way to do so would be to implement references to parent nodes.
I find that it is better to fit your data structures to your needs rather than try to make workarounds with other logic. After all, why must a tree be a directed graph? Making it directed is a specific implementation, much like a list and its specific implementation as a singly- or doubly-linked list.
It can still be a directed graph of ownership. Consider the following node:
#include <memory> // for std::unique_ptr

template <typename T>
struct node
{
    T data_;
    std::unique_ptr<node> left_child_;  // I own my children.
    std::unique_ptr<node> right_child_;
    node* parent_;                      // Just lookin' at my parent.
};
As Steven Meyer said above, it's really not cheating: build the data structure to solve your problem, don't worry about the ethics of it :-)
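For what it's worth, the lowest-common-ancestor problem the question mentions really does fall out almost for free once parent references exist. A rough sketch (TypeScript for brevity; the TreeNode shape is illustrative): record every ancestor of the first node, then walk up from the second node until you hit one of them.

interface TreeNode {
    value: number;
    parent: TreeNode | null; // null at the root
}

function lowestCommonAncestor(a: TreeNode, b: TreeNode): TreeNode | null {
    // Collect a and all of its ancestors.
    const ancestors = new Set<TreeNode>();
    for (let n: TreeNode | null = a; n !== null; n = n.parent) {
        ancestors.add(n);
    }
    // The first ancestor of b already in the set is the lowest common one.
    for (let n: TreeNode | null = b; n !== null; n = n.parent) {
        if (ancestors.has(n)) return n;
    }
    return null;
}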
Cross-posting from the Software Engineering Stack Exchange.
The Wikipedia definition states "For example, looking at a tree as a whole, one can talk about "the parent node" of a given node, but in general, as a data structure, a given node only contains the list of its children but does not contain a reference to its parent (if any)."
Is it conceptually possible to have a tree where you traverse it by starting at a given leaf node (rather than the root node) and use parent pointers to get to the root?
I ask this since I saw someone implement a tree where they used an array to hold all of the leaf/external nodes, and each leaf/external node points only to its parent node, and those parents point to their parent node, and so on until you get to the root node, which has no parent. Their implementation thus requires you to start at one of the leaves to get anywhere in the tree, and you cannot go "down" the tree since their tree nodes do not have any child pointers, only parent pointers.
I found this implementation interesting since I haven't seen anything like it, but I was curious whether it could still be considered a "tree". I have never seen a tree where you start traversal at the leaves instead of the root. I have also never seen a tree where the nodes only have parent pointers and no child pointers.
Yep, this structure exists. It's often called a spaghetti stack.
Spaghetti stacks are useful for representing the "is a part of" relation. For example, if you want to represent a class hierarchy in a way that makes upcasting efficient, then you might represent the class hierarchy as a spaghetti stack in which the node for each type stores a pointer to its parent type. That way, it's easy to find whether an upcast is valid by just walking upward from the node.
They're also often used in compilers to track scoping information. Each node represents one scope in the program, and each node has a pointer to the node representing the scope one level above it.
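As a rough illustration of the scoping case (a sketch; the Scope shape and resolve function are made-up names, not from any particular compiler), resolving a name simply walks the spaghetti stack upward until a binding is found or the root scope is passed:

interface Scope {
    parent: Scope | null;          // the enclosing scope, null at the outermost level
    bindings: Map<string, string>; // name -> whatever the compiler records about it
}

// Look a name up in the current scope, then in each enclosing scope in turn.
function resolve(scope: Scope | null, name: string): string | undefined {
    let s = scope;
    while (s !== null) {
        const found = s.bindings.get(name);
        if (found !== undefined) return found;
        s = s.parent; // walk one level up the spaghetti stack
    }
    return undefined; // not bound in any enclosing scope
}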
You can also think of a blockchain this way. Each block stores a backreference to its parent block. By starting at any block and tracing backwards to the root, you can recover the state encoded by that block.
Hope this helps!
If an array A of leaf nodes is given, traversal is possible. If only a single leaf node is given, I don't know how to traverse the tree. Pseudocode:
// initial step
add all leaf nodes in A to a queue Q and mark them as added
// removeNode(Q) returns a pointer to the node at the front of Q, or NULL when Q is empty
while((node = removeNode(Q)) != NULL)
    /* do your operation on node */
    if(node->parent != NULL and node->parent is not already added)
        mark node->parent as added
        add node->parent to Q
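A small runnable version of that idea, under the assumption that nodes carry only a parent pointer (TypeScript; LeafUpNode and traverseFromLeaves are illustrative names):

interface LeafUpNode {
    key: string;
    parent: LeafUpNode | null; // null at the root
}

// Visit every node reachable upward from the given leaves, each node once.
function traverseFromLeaves(leaves: LeafUpNode[], visit: (n: LeafUpNode) => void): void {
    const queue: LeafUpNode[] = [...leaves];
    const seen = new Set<LeafUpNode>(leaves);
    while (queue.length > 0) {
        const node = queue.shift()!; // node at the front of the queue
        visit(node);                 // do your operation on node
        const parent = node.parent;
        if (parent !== null && !seen.has(parent)) {
            seen.add(parent);
            queue.push(parent);
        }
    }
}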
I have data such that there are many parent nodes, each with 0-n children, where each child can itself have 0-n children. Each node has a unique identifier (key). Ultimately, the parents are not connected to each other. It seems like this would be a list of trees; however, that seems imprecise. I was thinking of joining them with a dummy root.
I need to be able to assemble a list of nodes that occur:
from any given node down (children)
from any given node down (children) then up to the root (up to the specific parent)
the top level parent of any given node (in an O(n) operation)
the level of the child in the tree (in an O(n) operation)
The structure will contain 300,000 nodes.
I was thinking perhaps I could implement a List of Trees and then also maintain a hash lookup structure that will reference a specific key value to provide me with a node as a starting point.
Is this a logical structure? Is there a better way to handle it? It seems crude to me.
If you are concerned with finding a root node quickly, you can think of creating a structure where each node also points to the tree it belongs to.
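For comparison, a minimal sketch of the forest-plus-hash-lookup idea from the question itself (TypeScript; ForestNode, nodesByKey, and the helper functions are illustrative, and both walks cost one step per level of depth):

interface ForestNode {
    key: string;
    parent: ForestNode | null; // null for a top-level parent
    children: ForestNode[];
}

// Hash lookup from key to node: an O(1) starting point anywhere in the forest.
const nodesByKey = new Map<string, ForestNode>();

// Walk up the parent pointers to find the top-level parent of any node.
function topLevelParent(key: string): ForestNode | undefined {
    let node = nodesByKey.get(key);
    if (!node) return undefined;
    while (node.parent !== null) {
        node = node.parent;
    }
    return node;
}

// The level of a node is the number of parent links between it and its top-level parent.
function level(key: string): number | undefined {
    let node = nodesByKey.get(key);
    if (!node) return undefined;
    let depth = 0;
    while (node.parent !== null) {
        node = node.parent;
        depth++;
    }
    return depth;
}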