Data structure/ Retrieving elements parent - algorithm

I'm looking for a way to find the elements that two parent elements have in common.
For example, the parents here are 1 and 2 (ignore the values below),
and the common value for those parents is 91.
A parent is a value that is on top and has NO parent of its own.
Next example:
Here we have 3 parents, and quite a lot of common elements for them:
91,
92,
93,
911,
912,
931,
932,
9311,
9312.
The main problem is getting the common elements. Maybe you also have suggestions on how I could store them?

Run a BFS/DFS (doesn't really matter which one) from the first node and store a visited bit for every node (say in a vector/array of bool).
Now run the same algorithm again from the second node. Every time you reach a new node, check whether it was visited by the first run as well. If it was, the node is one of the common elements, so output it to wherever you want.
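A minimal sketch of the two-pass idea above, in Python. It assumes the structure is stored as a dictionary mapping each node to a list of its children; the names `reachable` and `common_elements` and the sample graph are made up for illustration, since the original pictures are not available here.

```python
from collections import deque

def reachable(graph, start):
    """Breadth-first search: return the set of nodes reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

def common_elements(graph, first, second):
    """Nodes visited by both runs, one starting at first and one at second."""
    visited_by_first = reachable(graph, first)
    return sorted(n for n in reachable(graph, second) if n in visited_by_first)

# Hypothetical graph: parents 1 and 2 both lead to 91 and its children.
graph = {1: [91, 11], 2: [91, 21], 91: [911, 912]}
print(common_elements(graph, 1, 2))
```

A set plays the role of the "visited bit" here; with dense integer node ids, a list of booleans would work just as well.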

Related

Optimal solution to fit the items in multiple capacity bags

I have been working on this problem (https://github.com/alexpchung/File-Distribution-Planning/blob/master/README.pdf) where I need to find an optimal solution to place the files in the node.
Here is my algorithm which I have used so far
Say the number of nodes is N.
Keep track of the available space on every node.
Iterate through every file; each file has N nodes it could go to (assuming the file fits, etc.).
Recursively evaluate for every
Another solution I have thought of is to iterate through every node and solve a 0/1 knapsack for each. Unfortunately, I got stuck: since the node sizes are not fixed, it would give an incorrect solution.
If you have any pointers that would be great.
Thanks.
Maybe you can benchmark this:
Sort both lists (node capacities and file sizes) in increasing order.
Start from the biggest file.
Also start from the biggest node.
Check whether the file fits:
true: put it in.
false: put it on a "failed" list, since no bigger node exists.
If the selected (biggest) node is full, move on to the next smaller node.
Move on to the next smaller file.
Go back to the checking step until one of these conditions is true:
all files are assigned and empty nodes exist
all nodes are full and unplaced files exist
(*) Sort only the nodes by their empty space (when "empty nodes exist" is true, or both are true).
Duplicate the node list in the opposite order.
Check whether the "latest added file" on the node with the least empty space fits into the node with the most empty space, and whether moving it yields equal/balanced empty space on both:
true: send the file to that node.
false: move on to the next "least empty space" node, since that file can't fit into the others either.
Iterate over both lists (and remove refined pairs from the lists).
If at least one file could be refined, go to (*).
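The first phase of the steps above can be approximated with a short Python sketch. This is a simplification, not the exact procedure: it uses plain first-fit decreasing (each file, largest first, goes to the first node in decreasing-capacity order that still has room) and leaves out the rebalancing phase marked (*). The function name and the sample sizes are made up for illustration.

```python
def place_files(file_sizes, node_capacities):
    """Greedy largest-first placement (first-fit decreasing).

    Returns (placement, free, failed): the files put on each node, the space
    left on each node, and the files that fit nowhere. Nodes are ordered by
    decreasing capacity.
    """
    free = sorted(node_capacities, reverse=True)   # remaining space per node
    placement = [[] for _ in free]
    failed = []
    for size in sorted(file_sizes, reverse=True):  # biggest file first
        for i, space in enumerate(free):
            if size <= space:
                free[i] -= size
                placement[i].append(size)
                break
        else:
            failed.append(size)                    # no node has enough room
    return placement, free, failed

placement, free, failed = place_files([7, 5, 4, 3, 1], [10, 8])
print(placement, free, failed)
```

A greedy pass like this is fast, but it does not guarantee an optimal assignment, which is why benchmarking it against the recursive search is worthwhile.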

Find number of leaves under each node of a tree

I have a tree which is represented in the following format:
nodes is a list of nodes in the tree in the order of their height from top. Node at height 0 is the first element of nodes. Nodes at height 1 (read from left to right) are the next elements of nodes and so on.
n_children is a list of integers such that n_children[i] = number of children of nodes[i].
For example given a tree like {1: {2, 3:{4,5,2}}}, nodes=[1,2,3,4,5,2], n_children = [2,0,3,0,0,0].
Given a Tree, is it possible to generate nodes and n_children and the number of leaves corresponding to each node in nodes by traversing the tree only once?
Is such a representation unique? Or is it possible for two different trees to have the same representation?
For the first question - creating the representation given a tree:
I am assuming by "a given tree" we mean a tree that is given in the form of node-objects, each holding its value and a list of references to its children-node-objects.
I propose this algorithm:
1. Start at node=root.
2. If node.children is empty, return {values_list:[[node.value]], children_list:[[0]]}.
3. Otherwise:
3.1. construct two lists. One will be called values_list and each element there shall be a list of values. The other will be called children_list and each element there shall be a list of integers. Each element in these two lists will represent a level in the sub-tree beginning with node, including node itself (will be added at step 3.3).
So values_list[1] will become the list of values of the children-nodes of node, and values_list[2] will become the list of values of the grandchildren-nodes of node. values_list[1][0] will be the value of the leftmost child-node of node. And values_list[0] will be a list with one element alone, values_list[0][0], which will be the value of node.
3.2. for each child-node of node (for which we have references through node.children):
3.2.1. start over at (2.) with the child-node set to node, and the returned results will be assigned back (when the function returns) to child_values_list and child_children_list accordingly.
3.2.2. for each index i in the lists (they are of the same length): if there is already a list in values_list[i], concatenate child_values_list[i] to values_list[i] and concatenate child_children_list[i] to children_list[i]. Otherwise assign values_list[i]=child_values_list[i] and children_list[i]=child_children_list[i] (that would be a push, adding to the end of the list).
3.3. Make node.value the sole element of a new list and add that list to the beginning of values_list. Make node.children.length the sole element of a new list and add that list to the beginning of children_list.
3.4. return values_list and children_list
when the above returns with values_list and children_list for node=root (from step (1)), all we need to do is concatenate the elements of the lists (because they are lists, each for one specific level of the tree). After concatenating the list-elements, the resulting values_list_concatenated and children_list_concatenated will be the wanted representation.
In the algorithm above we visit a node only by starting step (2) with it set as node and we do that only once for each child of a node we visit. We start at the root-node and each node has only one parent => every node is visited exactly once.
For the number of leaves associated with each node (if I understand correctly, the number of leaves in the sub-tree of which the node is the root), we can add another list that will be generated and returned: leaves_list.
In the stop-case (the node has no children, step (2)) we will return leaves_list:[[1]]. In step (3.2.2) we will concatenate the list-elements just like the other two lists' list-elements. And in step (3.3) we will sum the first list-element leaves_list[0] and make that sum the sole element of a new list that we add to the beginning of leaves_list (something like leaves_list.add_to_beginning([leaves_list[0].sum()])).
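The recursive procedure above can be sketched in Python roughly as follows. The `Node` class and the function names `levels` and `represent` are assumptions made for illustration; the per-level lists correspond to values_list, children_list and leaves_list.

```python
class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def levels(node):
    """Return per-level lists (values, child counts, leaf counts) for the
    sub-tree rooted at node. Each node is visited exactly once."""
    if not node.children:                       # stop-case: a leaf
        return [[node.value]], [[0]], [[1]]
    values, counts, leaves = [], [], []
    for child in node.children:
        c_values, c_counts, c_leaves = levels(child)
        for i in range(len(c_values)):
            if i < len(values):                 # level exists: concatenate
                values[i] += c_values[i]
                counts[i] += c_counts[i]
                leaves[i] += c_leaves[i]
            else:                               # new level: push
                values.append(list(c_values[i]))
                counts.append(list(c_counts[i]))
                leaves.append(list(c_leaves[i]))
    # Step 3.3: put the node itself in front as level 0.
    values.insert(0, [node.value])
    counts.insert(0, [len(node.children)])
    leaves.insert(0, [sum(leaves[0])])          # leaves of all direct children
    return values, counts, leaves

def represent(root):
    """Flatten the per-level lists into nodes, n_children and leaf counts."""
    flat = lambda per_level: [x for level in per_level for x in level]
    values, counts, leaves = levels(root)
    return flat(values), flat(counts), flat(leaves)

# The example tree {1: {2, 3: {4, 5, 2}}}
tree = Node(1, [Node(2), Node(3, [Node(4), Node(5), Node(2)])])
print(represent(tree))
```

For the example tree this yields the nodes and n_children lists given in the question, plus one leaf count per node.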
For the second question - is this representation unique:
To prove uniqueness we actually want to show that the function (let's call it rep for "representation") preserves distinctness over the space of trees, i.e. that it is an injection. As you can see in the wiki linked, for that it suffices to show that there exists a function (let's call it tre for "tree") that, given a representation, gives a tree back, and that for every tree t it holds that tre(rep(t))=t. In simple words: we can make a method that takes a representation and builds a tree out of it, and for every tree, if we make its representation and pass that representation through that method, we get the exact same tree back.
So let's get cracking!
Actually the first job - creating that method (the function tre) is already done by you - by the way you explained what the representation is. But let's make it explicit:
1. If the lists are empty, return the empty tree. Otherwise continue.
2. Make the root node with values[0] as its value and n_children[0] as its number of children (without making the children nodes yet).
3. Initialize a list index i=1, a level index li=1, a level-elements index lei=root.children.length and a next-level-elements accumulator nle_acc=0.
4. while lei>0:
4.1. for lei times:
4.1.1. make a node with values[i] as value and n_children[i] as the number of children.
4.1.2. add the new node as the leftmost child in level li that has not been filled yet (traverse the tree to level li from the leftmost node rightwards and assign the new node to the first reference that is not assigned yet. We know the previous level is done, so each node in level li-1 has a children.length property we can check to see whether it already has the number of children it should have).
4.1.3. add nle_acc+=n_children[i]
4.1.4. increment ++i
4.2. assign lei=nle_acc (level-elements can take what the accumulator gathered for it)
4.3. clear nle_acc=0 (the next-level-elements accumulator needs to accumulate from the start for the next round)
4.4. increment ++li (the next round fills the next level)
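A minimal Python sketch of such a tre function follows. It is a variation on the steps above rather than a literal transcription: instead of the level counters it keeps a queue of parents that still wait for their children, which fills the tree in the same level-by-level, left-to-right order. The `Node` class and the names `build_tree` and `to_nested` are assumptions made for illustration.

```python
from collections import deque

class Node:
    def __init__(self, value):
        self.value = value
        self.children = []

def build_tree(nodes, n_children):
    """Rebuild a tree from its level-order values and child counts."""
    if not nodes:
        return None                              # the empty tree
    root = Node(nodes[0])
    pending = deque([(root, n_children[0])])     # parents awaiting children
    i = 1                                        # next unread list index
    while pending:
        parent, count = pending.popleft()
        for _ in range(count):
            child = Node(nodes[i])
            parent.children.append(child)
            pending.append((child, n_children[i]))
            i += 1
    return root

def to_nested(node):
    """Helper for inspection: (value, [children...]) as nested tuples."""
    return (node.value, [to_nested(c) for c in node.children])

rebuilt = build_tree([1, 2, 3, 4, 5, 2], [2, 0, 3, 0, 0, 0])
print(to_nested(rebuilt))
```

Because each parent consumes exactly n_children[i] entries in order, a given pair of lists admits only one tree, which is the intuition behind the uniqueness claim.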
Now we need to prove that an arbitrary tree that is passed through the first algorithm and then through the second algorithm (this one here) comes out the same as it was originally.
As I'm not trying to prove the correctness of the algorithms (although I should), let's assume they do what I intended them to do, i.e. the first one writes the representation as you described it, and the second one makes a tree level-by-level, left-to-right, assigning a value and the number of children from the representation, and fills the children references according to those numbers when it comes to the next level.
So each node has the right amount of children according to the representation (that's how the children were filled), and that number was written from the tree (when generating the representation). And the same is true for the values and thus it is the same tree as the original.
The proof actually should be much more elaborate and detailed - but I think I'll leave it at that now. If there will be a demand for elaboration maybe I'll make it an actual proof.

How to find nodes fast in an unordered tree

I have an unordered tree in the form of, for example:
Root
    A1
        A1_1
            A1_1_1
            A1_1_2
                A1_1_2_1
                A1_1_2_2
                A1_1_2_3
            A1_1_3
            A1_1_n
        A1_2
        A1_3
        A1_n
    A2
        A2_1
        A2_2
        A2_3
        A2_n
The tree is unordered.
Each node can have a random number N of children.
Each node stores a unique long value.
The required value can be at any position.
My problem: if I need the long value of A1_1_2_3, the first time I traverse the nodes I do a depth-first search to get it. However, on later calls to the same node I must get its value without a recursive search. Why? If this tree had hundreds of thousands of nodes to go through before reaching my A1_1_2_3 node, it would take too much time.
What I thought of, is to leave some pointers after the first traverse. E.g. for my case, when I give back the long value for A1_1_2_3 I also give back an array with information for future searches of the same node and say: to get to A1_1_2_3, I need:
first child of Root, which is A1
first child of A1, which is A1_1
second child of A1_1, which is A1_1_2
third child of A1_1_2, which is what I need: A1_1_2_3
So I figured I would store this information along with the value for A1_1_2_3 as an array of indexes: [0, 0, 1, 2]. By doing so, I could easily recreate the node on subsequent calls to the A1_1_2_3 and avoid recursion each time.
However, the nodes can change. On subsequent calls I might have a new structure, so the indexes I stored earlier would not match anymore. If this happens, I thought that whenever I don't find the element anymore, I would go back up a level and search for the item recursively, and so on until I find it again, and then store the indexes again for future reference:
e.g. if my A1_1_2_3 is now situated in this new structure:
A1_1
    A1_1_0
    A1_1_1
    A1_1_2
        A1_1_2_1
        A1_1_2_2
        A1_1_21_22
        A1_1_2_3
... in this case the new element A1_1_0 ruined my stored structure, so I would go back up a level and search children again recursively until I find it again.
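The idea described above (cache an index path, and when it goes stale, back up one level at a time and search again) can be sketched in Python roughly like this. The `Node` class, the function names and the dictionary used as a cache are all assumptions made for illustration; the search is recursive, so a very deep tree would need an explicit stack instead.

```python
class Node:
    def __init__(self, name, value, children=None):
        self.name = name
        self.value = value
        self.children = children or []

def dfs_path(node, name):
    """Depth-first search: index path from node to the named node, or None."""
    if node.name == name:
        return []
    for i, child in enumerate(node.children):
        sub = dfs_path(child, name)
        if sub is not None:
            return [i] + sub
    return None

def follow(root, path):
    """Nodes met while following path from root; stops at an invalid index."""
    trail = [root]
    for i in path:
        if i >= len(trail[-1].children):
            break
        trail.append(trail[-1].children[i])
    return trail

def find(root, name, cache):
    """Return (node, index_path); try the cached path before searching."""
    path = cache.get(name)
    if path is None:                       # first call: full search
        path = dfs_path(root, name)
        if path is None:
            return None, None
        cache[name] = path
        return follow(root, path)[-1], path
    trail = follow(root, path)
    if len(trail) == len(path) + 1 and trail[-1].name == name:
        return trail[-1], path             # cached path is still valid
    # Stale path: back up one level at a time and search that sub-tree.
    for depth in range(len(trail) - 1, -1, -1):
        sub = dfs_path(trail[depth], name)
        if sub is not None:
            cache[name] = path[:depth] + sub
            return follow(root, cache[name])[-1], cache[name]
    del cache[name]                        # the node no longer exists
    return None, None
```

Since the index of the node among its siblings is the last entry of the returned path, the caller also knows the position from which to continue paging. Note that each step up re-searches the sub-tree already searched one level below, so a production version might skip that branch.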
Does what I thought of here even make sense, or am I overcomplicating things? I'm talking about an unordered tree which can have at most about three hundred thousand nodes, and it is vital that I can jump to nodes as fast as possible. But the tree can also be very small, under 10 nodes.
Is there a more efficient way to search in such a situation?
Thank you for any idea.
edit:
I forgot to add: what I need on subsequent calls is not just the same value; its position is also important, because I must get the next page of children after that child (since it's a tree structure, I'm paging through the nodes after the initially selected one). Hope it makes more sense now.

B-Trees insertion

How do I add 35?
How do I know whether to move a key up (up to the node with 34 and 78, and if I do that, which key do I move up?) and make more children (to fulfill the "A non-leaf node with k children contains k−1 keys." rule)
OR
just split up the 39,44,56,74 (and 35) node into three children, like what I did in step 8?
AFAIK the insertion procedure is the following:
Find a position (node). Insert. Split if necessary. In this case 39,44,56,74 becomes 35,39,44,56,74. You need to split now: the new nodes are 35,39 and 56, 74 and the parent is now 34,44,78
Look at this example.
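The split described above can be sketched in a few lines of Python. This is only an illustration of that single step, not a full B-tree: nodes are plain sorted lists of keys, the function names are made up, and the limit of 4 keys per node is an assumption inferred from the example.

```python
import bisect

def split_overfull(keys):
    """Split a sorted, overfull key list around its median key."""
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]

def insert_and_split(node_keys, parent_keys, key, max_keys=4):
    """Insert key into a leaf; if it overflows, move the median key up.

    Returns (list of resulting leaf nodes, resulting parent keys).
    """
    keys = sorted(node_keys + [key])
    if len(keys) <= max_keys:
        return [keys], list(parent_keys)
    left, median, right = split_overfull(keys)
    new_parent = list(parent_keys)
    bisect.insort(new_parent, median)      # the median moves up to the parent
    return [left, right], new_parent

leaves, parent = insert_and_split([39, 44, 56, 74], [34, 78], 35)
print(leaves, parent)
```

If the parent itself becomes overfull after receiving the median, the same split is repeated one level higher.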

How to do B-Tree Insert

I am trying to insert 3 values into this B-Tree, 60, 61, and 62. I understand how to insert values when a node is full, and has an empty parent, but what if the parent is full?
For example, when I insert 60 and 61, that node will be full. I can't extend the parent, or the parent of the parent (because they are full). So can I change the values of the parent? I have provided an image of the B-tree before and after my insert.
Attempt to insert 60, 61, 62:
Notice I changed the 66 in the root to 62, and added 62 to the <72 node. Is this the correct way to do this?
With the insertion you've done, you get what's normally referred to as a B* tree. In a "pure" B-tree, the insertion when the root is full would require splitting the current root into two nodes, and creating a new root node above them (B-tree implementations do not require that root node to follow the same rule as other nodes for the minimum number of descendants, so having only two would be allowed).
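The root split mentioned above can be sketched in Python as follows. The `BNode` class, the function name and the keys are hypothetical (the tree from the question's image is not reproduced here); the sketch only shows how a full root is divided into two nodes under a new root.

```python
class BNode:
    def __init__(self, keys, children=None):
        self.keys = keys                   # sorted keys of this node
        self.children = children or []     # empty for a leaf

def split_root(root):
    """Split a full root around its median key and return the new root."""
    mid = len(root.keys) // 2
    left = BNode(root.keys[:mid], root.children[:mid + 1])
    right = BNode(root.keys[mid + 1:], root.children[mid + 1:])
    # The new root holds one key and two children, which is allowed
    # because the root is exempt from the minimum-occupancy rule.
    return BNode([root.keys[mid]], [left, right])

# Hypothetical full root with three keys and four leaf children.
old_root = BNode([10, 20, 30],
                 [BNode([5]), BNode([15]), BNode([25]), BNode([35])])
new_root = split_root(old_root)
print(new_root.keys, [c.keys for c in new_root.children])
```

This is the only operation that makes a B-tree taller, which is why all leaves stay at the same depth.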
