Is there a rule for how to split a node in a 2-3-4 tree?
E.g. If I insert 3, 7, 4, 9 into the 2-3-4 tree:
Will it be split like this (yellow) or that (green) as shown here:
Are both valid?
Green. You need to consider the algorithm steps. Check out the Wikipedia page for the insertion steps. The key part is to split a 4-node (which has 3 values) by moving the middle value up a level before considering the next insert.
1. Insert 3 into blank. Result: 3 (a 2-node)
2. Insert 7. Result: 3 - 7 (a 3-node)
3. Insert 4. Result: 3 - 4 - 7 (a 4-node)
4. Insert 9. The root is already a 4-node, so it must be split first.
The split moves 4 up a level, and 3 and 7 become child nodes of 4
(like your green diagram). 9 is then added next to the 7.
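The steps above can be sketched in Python. This is only an illustrative top-down insertion under the assumption that every 4-node met on the way down is split before descending; the names `Node`, `split_child` and `insert` are made up for this example and do not come from any library.

```python
class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []          # 1 to 3 sorted keys
        self.children = children or []  # empty list for a leaf

    def is_leaf(self):
        return not self.children


def split_child(parent, i):
    """Split the 4-node parent.children[i]; its middle key moves up into parent."""
    full = parent.children[i]
    left = Node(full.keys[:1], full.children[:2])
    right = Node(full.keys[2:], full.children[2:])
    parent.keys.insert(i, full.keys[1])
    parent.children[i:i + 1] = [left, right]


def insert(root, key):
    """Insert key and return the (possibly new) root."""
    if root is None:
        return Node([key])
    if len(root.keys) == 3:             # root is a 4-node: split it first
        root = Node([], [root])
        split_child(root, 0)
    node = root
    while not node.is_leaf():
        i = sum(k < key for k in node.keys)   # child to descend into
        if len(node.children[i].keys) == 3:   # split 4-nodes on the way down
            split_child(node, i)
            if key > node.keys[i]:
                i += 1
        node = node.children[i]
    node.keys.append(key)
    node.keys.sort()
    return root


root = None
for k in [3, 7, 4, 9]:
    root = insert(root, k)
print(root.keys, [c.keys for c in root.children])
```

With the sequence 3, 7, 4, 9 the root ends up holding 4, with children 3 and 7 - 9, which is the green layout.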
Show the heap at each stage when the following numbers are inserted to an initially empty min-heap in the given order: {11, 17, 13, 4, 4, 1 }. Now, show the heap at each stage when we successively perform the deleteMin operation on the heap until it is empty.
Here is the answer/checkpoint I receive:
![1](https://imgur.com/zu47RIF)
I have 2 questions please:
I don't understand when we insert element 4 the second time, why do we shift 11 to make it the right child of the old element/firstly inserted element 4? Is it because we want to satisfy a requirement of the complete binary tree, which is each node in the levels from 1 to k - 2 has exactly 2 children (k = levels of the trees, level k is the bottom-most level)?
I don't understand how, when we deleteMin (removing 1), 13 becomes the right child of the new parent 11 (which is the left child of 4). Just a quick note that my instructor gave the class 2 ways to deleteMin. The other way is fine with me - it's just the reverse of the insertion process.
Like you said, the heap shape is an "almost complete tree": all levels are complete, except the lowest level which can be incomplete to the right. Therefore, the second 4 is necessarily added to the right of 17 to preserve the heap shape:
      4
     / \
   11   13
  /  \
17    4
After that, 4 switches places with 11 to regain the min-heap property.
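As a rough illustration, here is a minimal array-based sketch of that insertion in Python. It assumes the usual layout where the parent of index `i` sits at `(i - 1) // 2`; the function name `heap_insert` is just chosen for this example.

```python
def heap_insert(heap, value):
    """Append value at the next free slot (keeps the shape), then sift it up."""
    heap.append(value)
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= heap[i]:
            break                      # min-heap property holds again
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


heap = []
stages = []
for x in [11, 17, 13, 4, 4, 1]:
    heap_insert(heap, x)
    stages.append(list(heap))          # snapshot after each insertion
    print(heap)
```

After the second 4 is inserted, the array is `[4, 4, 13, 17, 11]`: the new 4 was appended as the right child of 11 and then swapped with it.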
Deletions are typically implemented by removing the root and putting the last (i.e., bottom-rightmost) element in its place. This preserves the heap shape. The new root is then allowed to sift down in order to regain the min-heap property. So 13 becomes the new root:
     13
    /  \
   4    4
  / \
17   11
Then 13 switches places with either child node. It looks like they chose the right-hand child in your example.
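A minimal sketch of this deletion in Python, using the same array layout (children of index `i` at `2*i + 1` and `2*i + 2`). The name `delete_min` is illustrative, and this version breaks ties toward the left child, whereas your checkpoint apparently went right; both results are valid min-heaps.

```python
def delete_min(heap):
    """Remove and return the root; move the last element to the root and sift it down."""
    smallest = heap[0]
    last = heap.pop()                  # bottom-rightmost element
    if heap:
        heap[0] = last
        i, n = 0, len(heap)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            child = i
            if left < n and heap[left] < heap[child]:
                child = left
            if right < n and heap[right] < heap[child]:
                child = right
            if child == i:
                break                  # neither child is smaller
            heap[i], heap[child] = heap[child], heap[i]
            i = child
    return smallest


heap = [1, 4, 4, 17, 11, 13]           # heap after inserting 11, 17, 13, 4, 4, 1
first = delete_min(heap)
after_first = list(heap)
rest = [delete_min(heap) for _ in range(len(heap))]
print(first, after_first, rest)
```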
I had class yesterday for data structures and my professor was discussing the uses of binary trees, and some info on its levels. I'm starting to work on creating a binary search tree, and I want to include a print function to display level on a tree
Tree example:
3 is the root. 3 has two children: to its left is 2 and to its right is 5.
2 has a child node to its left, 1.
5 has 2 child nodes. 4 is its left child node and 6 is the right child node.
I want to print level 1 of the tree (the 2 and 5)
    3
   / \
  2   5
 /   / \
1   4   6
I want to have a print function in my program to show a level in a tree, but I need some reason for doing that (my professor wants a reason, idk why and i didnt ask). Any reasons to show a level of the tree?
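For reference, a minimal sketch of such a level function in Python, assuming the root counts as level 0 (so that level 1 is 2 and 5, as in the example). The names `TreeNode`, `bst_insert` and `keys_at_level` are made up for this illustration.

```python
class TreeNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(node, key):
    """Insert key into the BST rooted at node and return the root."""
    if node is None:
        return TreeNode(key)
    if key < node.key:
        node.left = bst_insert(node.left, key)
    else:
        node.right = bst_insert(node.right, key)
    return node


def keys_at_level(node, level):
    """Return the keys on the given level, left to right (root is level 0)."""
    if node is None:
        return []
    if level == 0:
        return [node.key]
    return (keys_at_level(node.left, level - 1)
            + keys_at_level(node.right, level - 1))


root = None
for k in [3, 2, 5, 1, 4, 6]:
    root = bst_insert(root, k)
print(keys_at_level(root, 1))
```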
I'm trying to learn designing a btree.
Here are the values to develop a btree of order 5.
1,12,8,2,25,6,14,28,17,7,52,16,48,68,3,26,29,53,55,45,67.
When I insert 25, it breaks into child nodes
      8
    /   \
1 2      12 25
May I know on what basis 8 comes up as the parent? Why not any other number? What if the order of the B-tree were 4?
In a B-tree of order 5 each node (except the root) must have 2 to 4 values in it.
At the point you enter 25, the node has the values 1,2,8,12. In order to have at least 2 values in each new child (1,2) and (12,25) you have to split at 8.
Before inserting 25, the node state is like this:
Full node: 1, 2, 8, 12. Items in node are always sorted in B-tree.
When you insert the new item 25, the sequence becomes: 1, 2, 8, 12, 25.
In this sequence the middle item is the one that is promoted up.
A node split divides the data items equally: half go to the newly created node,
half remain in the old one, and the middle one goes up. This is the reason why 8 goes upward.
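The split can be sketched in a few lines of Python. This is only an illustration of the median rule for a single node, not a full B-tree; `split_overfull` is a hypothetical helper name.

```python
import bisect


def split_overfull(keys, new_key):
    """Add new_key to a full, sorted node and split it around the middle key.

    Returns (left_keys, promoted_key, right_keys).
    """
    merged = list(keys)
    bisect.insort(merged, new_key)     # keep the keys sorted
    mid = len(merged) // 2             # with an even count this picks the upper middle (a convention)
    return merged[:mid], merged[mid], merged[mid + 1:]


print(split_overfull([1, 2, 8, 12], 25))
```

For the node 1, 2, 8, 12 and the new key 25, this yields the left node 1, 2, the promoted key 8, and the right node 12, 25.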
The following figures contain a B-tree of order 5 and should help understand this situation better although the data inserted is different. In the sequence on right-side, the arrow indicates the item to be promoted upward.
I have got the following sequence (representing a tree):
4 2
1 4
3 4
5 4
2 7
0 7
6 0
Now, I am trying to sort this sequence, such that when a value appears on the left (column 1), it has already appeared on the right (column 2). More concretely, the result of the sorting algorithm should be:
1 4
3 4
5 4
4 2
2 7
6 0
0 7
Obviously, this works in O(n^2) with an algorithm iterating over each entry of column 1 and then looking for corresponding entries in column 2. But as n can be quite big (> 100000) in my scenario, I'm looking for an O(n log n) way to do it. Is this even possible?
Assumption:
I'm assuming this is also a valid sort sequence:
1 4
4 2
3 4
5 4
2 7
6 0
0 7
i.e. Once a value appears once on the right, it can appear on the left.
If this is not the case (i.e. all occurrences on the right have to come before any occurrence on the left), ignore the "remove all edges pointing to that element" part and only remove the intermediate element if it has no incoming edges left.
Algorithm:
1. Construct a graph where each element A points to another element B if the right element of A is equal to the left element of B. This can be done using a hash multi-map:
   - Go through the elements, inserting each element A into the hash map as A.left -> A.
   - Go through the elements again, connecting each element B with all elements appearing under B.right.
2. Perform a topological sort of the graph, giving you your result. It should be modified such that, instead of removing an edge pointing to an element, we remove all edges pointing to that element (i.e. if we already found an element containing some element on the right, we don't need to find another for that element to appear on the left).
Currently this is O(n^2) running time, because there are too many edges - if we have:
(1,2),(1,2),...,(1,2),(2,3),(2,3),...,(2,3)
There are O(n^2) edges.
This can be avoided by creating an intermediate element instead of having elements point directly to each other. In the above case, half of the elements will point to that element and that element will point to the other half. Then, during the topological sort, whenever we would have removed an edge to that element, we instead remove that element and all edges pointing from / to it.
Now there will be a maximum of O(n) edges, and, since topological sort can be done in linear time with respect to the elements and edges, the overall running time is O(n).
Note that it's not always possible to get a result: (1,2), (2,1).
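A possible Python sketch of the linear-time version, under the assumption stated at the top (a value may appear on the left as soon as it has appeared once on the right). Here each distinct value plays the role of the intermediate element, and the sort is done Kahn-style with a queue; the function name `order_pairs` is made up for this example.

```python
from collections import defaultdict, deque


def order_pairs(pairs):
    """Return the pairs in a valid order, or None if no valid order exists."""
    rights = {r for _, r in pairs}
    by_left = defaultdict(list)        # value -> indices of pairs with it on the left
    for i, (left, _) in enumerate(pairs):
        by_left[left].append(i)

    # Pairs whose left value never occurs on the right have no prerequisite.
    queue = deque(i for i, (left, _) in enumerate(pairs) if left not in rights)
    released = set()                   # values already seen on the right
    result = []
    while queue:
        i = queue.popleft()
        result.append(pairs[i])
        right = pairs[i][1]
        if right not in released:      # first appearance on the right:
            released.add(right)        # unlock every pair waiting for this value
            queue.extend(by_left[right])
    return result if len(result) == len(pairs) else None


pairs = [(4, 2), (1, 4), (3, 4), (5, 4), (2, 7), (0, 7), (6, 0)]
print(order_pairs(pairs))
print(order_pairs([(1, 2), (2, 1)]))   # no valid order, so this prints None
```

Each pair is queued at most once and each value is released at most once, so the work is linear in the number of pairs (with expected constant-time hashing).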
Illustrations:
For your example (pre-optimization), we'd have:
For my example above, we'd have:
I have a row with numbers 1:n. I'm looking to add a second row also with the numbers 1:n but these should be in a random order while satisfying the following:
No positions have the same number in both rows
No combination of numbers occurs twice
For example, in the following
Row 1: 1 2 3 4 5 6 7 ...
Row 2: 3 6 15 8 13 12 7 ...
the number 7 occurs at the same position in both rows 1 and 2 (namely position 7; thereby not satisfying rule 1)
while in the following
Row 1: 1 2 3 4 5 6 7 ...
Row 2: 3 7 15 8 13 12 2 ...
the combination of 2+7 appears twice (in positions 2 and 7; thereby not satisfying rule 2).
It would perhaps be possible – but unnecessarily time-consuming – to do this by hand (at least up until a reasonable number), but there must be quite an elegant solution for this in MATLAB.
This problem is called a derangement of a permutation.
Use the function randperm, in order to find a random permutation of your data.
x = [1 2 3 4 5 6 7];
y = x(randperm(numel(x)));  % randperm takes a scalar count, so shuffle x by index
Then, you can check whether the sequence is legal. If not, try again until it is.
The probability that a random permutation is a derangement tends to 1/e ≈ 0.37, which means that you need roughly e ≈ 2.7 attempts on average until you find one.
Therefore you will find the answer really quickly.
Alternatively, you can use this algorithm to create a random derangement.
Edit
If you want to have only cycles of size > 2, this is a generalization of the problem.
The probability in that case is smaller, but still bounded away from zero, so a valid permutation is found after a small expected number of attempts. So the same approach is still valid.
This is fairly straightforward. Create a random permutation of the nodes, but interpret the list as follows: Interpret it as a random walk around the nodes, and if node 'b' appears after node 'a', it means that node 'b' appears below node 'a' in the lists:
So if your initial random permutation is
3 2 5 1 4
Then the walk in this case is 3 -> 2 -> 5 -> 1 -> 4 (and back to 3), and you create the rows as follows:
Row 1: 1 2 3 4 5
Row 2: 4 5 2 3 1
This random walk will satisfy both conditions.
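A small Python sketch of this construction, assuming the walk wraps around from its last node back to its first so that the result is one big cycle (for n >= 3 such a cycle has no fixed points and no 2-cycles). The names `row_from_walk` and `random_single_cycle` are made up for this example.

```python
import random


def row_from_walk(walk):
    """Build row 2 from a walk: the number after `a` in the walk goes below `a`."""
    row2 = [0] * len(walk)
    for a, b in zip(walk, walk[1:] + walk[:1]):   # wrap around to close the cycle
        row2[a - 1] = b
    return row2


def random_single_cycle(n):
    """Random permutation of 1..n consisting of a single cycle of length n."""
    walk = list(range(1, n + 1))
    random.shuffle(walk)
    return row_from_walk(walk)


print(row_from_walk([3, 2, 5, 1, 4]))   # the example walk above
print(random_single_cycle(7))
```

Note that this only ever produces single-cycle permutations, so it does not sample uniformly from all permutations that satisfy the two rules.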
But do you wish to allow more than one cycle in your network? I know you don't want two people to have each other's hat. But what about 7 people, where 3 of them have each other's hats and the other 4 have each other's hats? Is this acceptable and/or desirable?
Andrey has already pointed you to randperm and the rejection-sampling-like approach. After generating a permutation p, an easy way to check whether it has a fixed point is any(p==1:n). An easy way to check whether it contains cycles of length 2 is any(p(p)==1:n).
So this gets permutations p of 1:n fulfilling your requirements:
p = [];
while isempty(p)
    p = randperm(n);
    if any(p == 1:n)            % fixed point: same number in both rows
        p = [];
    elseif any(p(p) == 1:n)     % cycle of length 2: a combination occurs twice
        p = [];
    end
end
Surrounding this with a for loop and counting the iterations of the while loop for each run, it seems that one needs to generate on average 4.5 permutations for every "valid" one (and 6.2 if cycles of length three are not allowed either). Very interesting.
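These empirical averages appear to agree with a standard asymptotic result: for large n, the numbers of cycles of each length j in a uniform random permutation behave approximately like independent Poisson variables with mean 1/j. Taking that as given, a rough check of the expected number of attempts is:

```latex
\Pr[\text{no cycle of length} \le k] \;\longrightarrow\; \exp\!\Big(-\sum_{j=1}^{k} \frac{1}{j}\Big)
\qquad (n \to \infty)

k = 1:\quad e^{-1} \approx 0.368      \;\Rightarrow\; \text{about } e \approx 2.72 \text{ attempts}
k = 2:\quad e^{-3/2} \approx 0.223    \;\Rightarrow\; \text{about } e^{3/2} \approx 4.48 \text{ attempts}
k = 3:\quad e^{-11/6} \approx 0.160   \;\Rightarrow\; \text{about } e^{11/6} \approx 6.25 \text{ attempts}
```

The k = 2 and k = 3 values are close to the measured 4.5 and 6.2.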