What data structure should I use for hierarchical data? - data-structures

Please help:
I have a data set that needs to be stored like a tree; the structure is shown below. I am not sure how to store it. The letters are not comparable; this is just a hierarchical structure.

From the picture, this just looks like a binary tree where only the leaf nodes contain data. So, use a binary tree and only put data in the leaf nodes.
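As a minimal sketch of that suggestion (the picture is not reproduced here, so the shape and the letters "A", "B", "C" below are made up for illustration, and the names `Node` and `leaves` are hypothetical):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    data: Optional[str] = None  # only set on leaf nodes

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def leaves(node: Optional[Node]) -> List[str]:
    """Collect the data stored in the leaves, left to right."""
    if node is None:
        return []
    if node.is_leaf():
        return [node.data]
    return leaves(node.left) + leaves(node.right)


# Hypothetical shape: inner nodes carry no data, leaves carry one letter each.
tree = Node(
    left=Node(left=Node(data="A"), right=Node(data="B")),
    right=Node(data="C"),
)
```

Since the letters are not comparable, nothing here orders them; the tree only records which items are grouped under which inner node.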

Related

How to get all root to leaf paths in a tree or a graph from a flat structure?

I am working on a simple water pipeline path model and want to list each path.
The model is simple because the pipes do not form loops or grids. It consists of nodes that represent the ends of each pipe segment.
Based on this we can say it is similar to a Binary Tree data model. However, as I understand it, trees are hierarchical data structures. I also see on https://www.geeksforgeeks.org/print-root-leaf-path-without-using-recursion/?ref=lbp that the data is defined with left, right, left.left etc., describing the exact location of each node.
In my case, the data should include only the start and end nodes for each pipe segment. Each node should also record whether it is the source (root) or a leaf node. The number of leaf nodes will be equal to the number of paths.
My model does not require any hierarchy and also it does not require left-right definition.
In this case we may say it is similar to a Graph, but my model also does not have loops or grids.
So please advise how to model this and create an algorithm.
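One possible way to model this, as a sketch only: treat the flat list of segments as a rooted tree with any number of children per node, stored as an adjacency list. It assumes each segment is given as a (start, end) pair directed away from the source, and that there is exactly one source; the function name `root_to_leaf_paths` and the node labels are hypothetical.

```python
from collections import defaultdict


def root_to_leaf_paths(segments):
    """Return every path from the single source node to each leaf node."""
    children = defaultdict(list)
    has_parent = set()
    for start, end in segments:
        children[start].append(end)
        has_parent.add(end)

    # The source is the only node that never appears as a segment end.
    roots = [node for node in children if node not in has_parent]
    if len(roots) != 1:
        raise ValueError("expected exactly one source node")

    paths = []
    stack = [(roots[0], [roots[0]])]  # iterative depth-first walk
    while stack:
        node, path = stack.pop()
        kids = children.get(node, [])
        if not kids:  # a leaf: the path is complete
            paths.append(path)
        for kid in reversed(kids):
            stack.append((kid, path + [kid]))
    return paths


segments = [("S", "A"), ("A", "B"), ("A", "C"), ("S", "D")]
paths = root_to_leaf_paths(segments)
```

No left/right positions are needed, and whether a node is the source or a leaf falls out of the segment list rather than being stored on the node.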

What is the name of this data structure?

For the structure given in below data model, where each node is,
type Person {
firstName,
lastName,
Pointer to list of his children,
Pointer to next node
}
This data model looks like neither a tree nor a graph.
What is the name of this data model?
This is a tree in the left-child right-sibling representation.
A multi-child tree basically needs a dynamic data structure within each node to represent the children. Sometimes, fixed-size nodes are preferred for various reasons. This representation allows that with a fixed amount of space per node: only the first child is recorded, and all the children form a linked list. Obviously, searching the children of a node in this representation is linear in the number of children.
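A rough sketch of that representation, reusing the `Person` fields from the question (the attribute names `first_child` and `next_sibling` and the methods are my own, chosen for illustration):

```python
class Person:
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name
        self.first_child = None   # "pointer to list of his children"
        self.next_sibling = None  # "pointer to next node"

    def add_child(self, child):
        """Append child to the end of the sibling list (linear time)."""
        if self.first_child is None:
            self.first_child = child
        else:
            last = self.first_child
            while last.next_sibling is not None:
                last = last.next_sibling
            last.next_sibling = child
        return child

    def children(self):
        """Walk the linked list of children, oldest first."""
        node = self.first_child
        while node is not None:
            yield node
            node = node.next_sibling


parent = Person("Ada", "Smith")
parent.add_child(Person("Bob", "Smith"))
parent.add_child(Person("Cid", "Smith"))
names = [child.first_name for child in parent.children()]
```

Every node has exactly two links regardless of how many children it has, which is the fixed-size property described above.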
Looks like a Directed acyclic graph to me.
https://en.wikipedia.org/wiki/Directed_acyclic_graph

Why store data only in the leaf nodes of a balanced binary search tree?

I have bought a nice little book about computational geometry. While reading it here and there, I often stumbled over the use of this special kind of binary search tree. These trees are balanced and should store the data only in the leaf nodes, whereas inner nodes should only store values to guide the search down to the leaves.
The following image shows an example of these trees (where the leaves are rectangles and the inner nodes are circles).
I have two questions:
What is the advantage of not storing data in the inner nodes?
For the purpose of learning, I would like to implement such a tree. Therefore, I thought it might be a good idea to use an AVL tree as the basis, but is it a good idea?
Any kind of helpful resource is very welcome.
What is the advantage of not storing data in the inner nodes?
There are some tree data structures that, by design, require that no data is stored in the inner nodes, such as Huffman code trees and B+ trees. In the case of Huffman trees, the requirement is that no code is a prefix of another code (e.g. if the path to node 'A' is 101, the path to node 'B' cannot be 10); keeping the symbols only at the leaves guarantees this. In the case of B+ trees, it comes from the fact that the tree is optimized for block search (this also means that every internal node has a lot of children, and that the tree is usually only a few levels deep).
For the purpose of learning, I would like to implement such a tree. Therefore, I thought it might be a good idea to use an AVL tree as the basis, but is it a good idea?
Sure! An AVL tree is not extremely complicated, so it's a good candidate for learning.
It is common to have other kinds of binary trees with data at the leaves instead of the interior nodes, but fairly uncommon for binary SEARCH trees.
One reason you might WANT to do this is educational: it's often EASIER to implement a binary search tree this way than the traditional way. Why? Almost entirely because of deletions. Deleting a leaf is usually very easy, whereas deleting an interior node is harder/messier. If your data is only at the leaves, then you are always in the easy case!
It's worth thinking about where the keys on interior nodes come from. Often they are duplicates of keys that are also at the leaves (with data). Later, if the key at the leaf is deleted, the key at the interior nodes might still hang around.
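To make the deletion argument concrete, here is a small sketch of a leaf-oriented binary search tree. It is deliberately unbalanced (no AVL or red-black rotations), and the class and function names are hypothetical. Inner nodes hold only a routing key; keys less than or equal to it go left.

```python
class Leaf:
    def __init__(self, key, value):
        self.key, self.value = key, value


class Inner:
    def __init__(self, key, left, right):
        # key is a router only: keys <= key are in left, larger keys in right
        self.key, self.left, self.right = key, left, right


def insert(node, key, value):
    if node is None:
        return Leaf(key, value)
    if isinstance(node, Leaf):
        if key == node.key:
            node.value = value
            return node
        new = Leaf(key, value)
        lo, hi = (new, node) if key < node.key else (node, new)
        return Inner(lo.key, lo, hi)  # router duplicates the smaller leaf key
    if key <= node.key:
        node.left = insert(node.left, key, value)
    else:
        node.right = insert(node.right, key, value)
    return node


def delete(node, key):
    if node is None:
        return None
    if isinstance(node, Leaf):
        return None if node.key == key else node
    if key <= node.key:
        node.left = delete(node.left, key)
        return node.right if node.left is None else node  # splice out router
    node.right = delete(node.right, key)
    return node.left if node.right is None else node


def search(node, key):
    while isinstance(node, Inner):
        node = node.left if key <= node.key else node.right
    return node.value if node is not None and node.key == key else None


root = None
for k, v in [(5, "a"), (8, "b"), (3, "c"), (7, "d")]:
    root = insert(root, k, v)
root = delete(root, 5)
```

After the deletion, `search(root, 5)` finds nothing, yet the root router still holds the key 5: the leftover interior key mentioned above. Deletion never needs the successor-swapping step of a traditional binary search tree.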
What is the advantage of not storing data in the inner nodes?
In general, there is no advantage in not storing data in the inner nodes. For example, a red-black tree is a balanced tree and it stores its data in both the inner and the leaf nodes.
For the purpose of learning, I would like to implement such a tree. Therefore, I thought it might be a good idea to use an AVL tree as the basis, but is it a good idea?
In my opinion, it is.
One benefit to only keeping the data in leaf nodes (e.g., B+ tree) is that scanning/reading the data is exceedingly simple. The leaf nodes are linked together. So to read the next item when you are at the "end" (right or left) of the data within a given leaf node, you just read the link/pointer to the next (or previous) node and jump to the next leaf page.
With a B tree where data is in every node, you have to traverse the tree to read the data in order. That is certainly a well-defined process but is arguably more complex and typically requires more state information.
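The linked-leaf scan can be sketched as follows. This shows only the leaf level of a B+ tree, not the inner nodes, splits, or the backward link; `LeafPage`, `link`, and `scan` are hypothetical names, and `link` assumes a non-empty list of pages.

```python
class LeafPage:
    def __init__(self, items):
        self.items = items  # sorted (key, value) pairs held in this page
        self.next = None    # link to the following leaf page


def link(pages):
    """Chain the leaf pages left to right and return the first one."""
    for current, following in zip(pages, pages[1:]):
        current.next = following
    return pages[0]


def scan(first_leaf):
    """Yield every (key, value) pair in key order by following the links."""
    page = first_leaf
    while page is not None:
        yield from page.items
        page = page.next


first = link([
    LeafPage([(1, "a"), (2, "b")]),
    LeafPage([(3, "c")]),
    LeafPage([(4, "d"), (5, "e")]),
])
keys = [key for key, _ in scan(first)]
```

The only state the scan carries is the current page, whereas an in-order walk over a tree with data in every node needs a stack or parent pointers.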
I am reading the same book and they say it could be done either way, data storage at external or at internal nodes.
The trees they use are Red-Black.
In any case, here is an article that stores data at internal nodes of a Red Black Tree and then links these data nodes together as a list.
Balanced binary search tree with a doubly linked list in C++
by Arjan van den Boogaard
http://archive.gamedev.net/archive/reference/programming/features/TStorage/default.html

Data structure for a graph with complex nodes

I have a directed graph in which each node is a complex data type.
Does anyone know how to build a data structure for this graph?
For instance, like the picture here:
Thanks in advance.
Looking at the image, it just looks like a directed graph. I assume what you mean by "complex data type" is that each vertex holds some sort of complex information, such as a hash table or something.
What I recommend you do is create a dedicated Vertex class that holds the relevant information, then create a graph class using either an adjacency matrix or an adjacency list implementation, depending on how dense/sparse/big/small the graph will be.
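A minimal sketch of the adjacency-list variant, assuming vertices are identified by a unique name and that the "complex" information is a dictionary (both are assumptions, as are the method names):

```python
class Vertex:
    def __init__(self, name, payload=None):
        self.name = name
        # payload stands in for whatever complex data each node carries
        self.payload = payload if payload is not None else {}


class Graph:
    def __init__(self):
        self.vertices = {}   # name -> Vertex
        self.adjacency = {}  # name -> list of successor names

    def add_vertex(self, name, payload=None):
        if name not in self.vertices:
            self.vertices[name] = Vertex(name, payload)
            self.adjacency[name] = []
        return self.vertices[name]

    def add_edge(self, src, dst):
        """Add a directed edge src -> dst, creating vertices as needed."""
        self.add_vertex(src)
        self.add_vertex(dst)
        self.adjacency[src].append(dst)

    def successors(self, name):
        return [self.vertices[n] for n in self.adjacency[name]]


g = Graph()
g.add_vertex("a", {"weight": 3, "tags": ["start"]})
g.add_edge("a", "b")
g.add_edge("a", "c")
g.add_edge("c", "a")
```

An adjacency list like this suits sparse graphs; for a small, dense graph a matrix indexed by vertex position may be simpler.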

Are there any tools to find duplicate sections in a tree data structure?

I'm looking for a tool that finds duplicate nodes in a tree data structure (I'm using Freemind to map the data structure, but I'll settle for anything I can export a generic data tree out to...)
The idea is that I can break the tree down into modules which I can repeat thus simplifying the structure of the tree.
I would just have a table of subtrees.
Walk the tree depth-first. At each node, after visiting sub-nodes, if there is an equivalent node in the table, replace the current node with the one in the table. If there is not an equivalent node in the table, then add the current node to the table.
Does that do it? I believe it's called common-subexpression-elimination.
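The walk described above could look roughly like this. It is a sketch that assumes two nodes are "equivalent" when they have the same label and the same (already shared) children in the same order; `Node` and `share_subtrees` are hypothetical names, not part of Freemind or any other tool.

```python
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def share_subtrees(node, table):
    """Depth-first walk that replaces duplicate subtrees with one shared copy."""
    # Visit sub-nodes first so that equal children are already the same object.
    node.children = [share_subtrees(child, table) for child in node.children]
    # Children are canonical by now, so their identities describe the subtree.
    key = (node.label, tuple(id(child) for child in node.children))
    # Return the table entry if one exists; otherwise register this node.
    return table.setdefault(key, node)


tree = Node("root", [
    Node("module", [Node("x"), Node("y")]),
    Node("module", [Node("x"), Node("y")]),
])
table = {}
root = share_subtrees(tree, table)
```

Afterwards both children of `root` are the same object, and `table` holds one entry per distinct subtree, which is the list of reusable modules.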
Wouldn't it actually be better to prevent duplicate nodes in a tree? Why do you need duplicate nodes in a tree?
