How can I delete the node 1 from 2-3-4 Tree if the tree's structure is as below:
          4,10
       /   |    \
     2    6,8    12,14
    / \  / | \   / | \
   1   3 5 7 9  11 13 15
The way I like to think of it, you only delete leaves or internal nodes that have a single child, and the children of whatever you delete have to stay at the same level.
This requires pulling a key down from the level above to hold them, which merges with a sibling.
If the parent only has one key, this will cause a cascaded delete.
Deleting 1 by pulling down 2 causes a cascaded delete:
          4,10
       /   |    \
     X    6,8    12,14
     |   / | \   / | \
    2,3  5 7 9  11 13 15
The cascaded delete pulls down the 4:
            10
         /      \
     4,6,8       12,14
    / | | \      / | \
  2,3 5 7  9   11 13 15
If the sibling is too big to merge, you may have to redistribute from the sibling. This would be required if this was a 2-3 tree, for example:
Redistributing a key from 6,8
           6,10
        /   |    \
      4     8     12,14
     / \   / \    / | \
   2,3  5 7   9 11 13 15
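The merge-or-redistribute step described above can be sketched in Python. This is only an illustration under stated assumptions: the `Node` class, `fix_hole`, `example_tree` and `delete_1` are hypothetical names, and preferring the left sibling is an arbitrary choice, not part of any standard algorithm statement.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list = field(default_factory=list)
    children: list = field(default_factory=list)  # empty for a leaf


def fix_hole(parent, i, max_keys=3):
    """Repair parent.children[i], a node that has been left with no keys.

    max_keys is 3 for a 2-3-4 tree and 2 for a 2-3 tree.
    """
    hole = parent.children[i]
    use_left = i > 0                  # arbitrary: prefer the left sibling
    s = i - 1 if use_left else i + 1
    sib = parent.children[s]
    sep = min(i, s)                   # parent key between hole and sibling
    if len(sib.keys) + 1 <= max_keys:
        # Merge: pull the separating key down into the sibling.
        key = parent.keys.pop(sep)
        if use_left:
            sib.keys.append(key)
            sib.children.extend(hole.children)
        else:
            sib.keys.insert(0, key)
            sib.children[:0] = hole.children
        parent.children.pop(i)        # parent may now be a hole itself
    else:
        # Redistribute: rotate one key from the sibling through the parent.
        hole.keys.append(parent.keys[sep])
        if use_left:
            parent.keys[sep] = sib.keys.pop()
            if sib.children:
                hole.children.insert(0, sib.children.pop())
        else:
            parent.keys[sep] = sib.keys.pop(0)
            if sib.children:
                hole.children.append(sib.children.pop(0))


def leaf(*keys):
    return Node(list(keys))


def example_tree():
    return Node([4, 10], [
        Node([2], [leaf(1), leaf(3)]),
        Node([6, 8], [leaf(5), leaf(7), leaf(9)]),
        Node([12, 14], [leaf(11), leaf(13), leaf(15)]),
    ])


def delete_1(max_keys):
    root = example_tree()
    two = root.children[0]
    two.children[0].keys.remove(1)    # the leaf that held 1 becomes a hole
    fix_hole(two, 0, max_keys)        # pulls 2 down; node "2" becomes X
    if not two.keys:
        fix_hole(root, 0, max_keys)   # the cascaded step
    return root
```

Calling `delete_1(3)` follows the 2-3-4 case (root `10` with left child `4,6,8`), while `delete_1(2)` follows the 2-3 case, where `6,8` is too big to merge and a key is redistributed instead.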
If we take some elements and partition them on the first element, then take the partition element as the root of a binary tree and insert these elements into the binary tree, would there be a one-to-one correspondence?
Could someone explain how there is a one-to-one correspondence between the elements?
In unoptimized quicksort, each element of the array appears in exactly one recursive call as the pivot. The tree of recursive calls can be viewed as a binary search tree.
For example, sorting 3 1 4 5 9 2 6 with ^ marking the pivots (in this case, always the first element of the subarray) at each level, and | marking boundaries between subarrays:
3 1 4 5 9 2 6
^
1 2 | 3 | 4 5 9 6
^         ^
1 | 2 | 3 | 4 | 5 9 6
    ^           ^
1 | 2 | 3 | 4 | 5 | 9 6
                    ^
1 | 2 | 3 | 4 | 5 | 6 | 9
                    ^
    3
   / \
  /   \
 1     4
  \     \
   2     5
          \
           9
          /
         6
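A small Python check of this correspondence, assuming the pivot is always the first element and the partition keeps the relative order of the remaining elements (as in the trace above). The names `pivot_tree`, `bst_insert` and `bst_from` are made up for this sketch; trees are plain `(left, key, right)` tuples.

```python
def pivot_tree(xs):
    """Tree of quicksort pivots: first element as pivot, stable partition."""
    if not xs:
        return None
    pivot, rest = xs[0], xs[1:]
    left = [x for x in rest if x < pivot]
    right = [x for x in rest if x >= pivot]
    return (pivot_tree(left), pivot, pivot_tree(right))


def bst_insert(tree, x):
    """Insert x into an immutable BST made of (left, key, right) tuples."""
    if tree is None:
        return (None, x, None)
    left, key, right = tree
    if x < key:
        return (bst_insert(left, x), key, right)
    return (left, key, bst_insert(right, x))


def bst_from(xs):
    """Build a BST by inserting the elements in array order."""
    tree = None
    for x in xs:
        tree = bst_insert(tree, x)
    return tree


data = [3, 1, 4, 5, 9, 2, 6]
same = pivot_tree(data) == bst_from(data)
```

For the example array, `same` is `True`: the tree of pivots has the same shape as the BST built by inserting the elements in their original order.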
I have a text file:
10 1 15
10 12 30
10 9 45
10 9 40
10 15 55
12 9 0
12 7 18
12 10 1
9 1 1
9 2 1
9 0 1
14 5 5
And I would like to get this file as an output of my MapReduce job:
9 0 1
9 1 1
9 2 1
10 1 15
10 9 40
10 9 45
10 12 30
10 15 55
12 7 18
12 9 0
12 10 1
14 5 5
It means it has to be sorted by 1st, 2nd and 3rd columns.
I use this command:
#!/bin/bash
IN_DIR="/user/cloudera/temp"
OUT_DIR="/user/cloudera/temp_out"
NUM_REDUCERS=1
hdfs dfs -rmr ${OUT_DIR} > /dev/null
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
-D mapred.job.name="Parsing mista pages job 1 (parsing)" \
-D stream.num.map.output.key.fields=3 \
-D mapreduce.job.output.key.comparator.class=org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedComparator \
-D mapreduce.partition.keycomparator.options='-k1,1n -k2,2n -k3,3n' \
-D mapreduce.job.reduces=${NUM_REDUCERS} \
-mapper 'cat' \
-reducer 'cat' \
-input ${IN_DIR} \
-output ${OUT_DIR}
hdfs dfs -cat ${OUT_DIR}/* | head -100
And get exactly what I want. BUT. When I do NUM_REDUCERS=2 I get this output:
[cloudera@quickstart ~]$ hdfs dfs -cat /user/cloudera/temp_out/part-00000 | head -100
9 1 1
10 9 45
10 12 30
10 15 55
12 7 18
12 10 1
14 5 5
[cloudera@quickstart ~]$ hdfs dfs -cat /user/cloudera/temp_out/part-00001 | head -100
9 0 1
9 2 1
10 1 15
10 9 40
12 9 0
Why does the partitioner split data with the same key (for example '9') across different reducers?
How can I force the partitioner to split the mapper output by the key and sort it by value? For example, if I have 4 reducers, the reducers' input should be:
reducer 1
9 0 1
9 1 1
9 2 1
reducer 2
10 1 15
10 9 40
10 9 45
10 12 30
10 15 55
reducer 3
12 7 18
12 9 0
12 10 1
reducer 4
14 5 5
You can override the default Partitioner to send each key to a different reducer. Set the number of reducers to the number of distinct keys, and let each reducer deal with only one key.
For example:
groupMap.put("9", 0);
groupMap.put("10", 1);
groupMap.put("12", 2);
groupMap.put("14", 3);
Add the -partitioner argument to use your own partitioner in the job.
I think it might work for you.
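To see why this helps: with stream.num.map.output.key.fields=3 the whole line is the key, so a partitioner that hashes the whole key can send "9 0 1" and "9 1 1" to different reducers. The sketch below only simulates the idea in plain Python and is not Hadoop code; `GROUP_MAP`, `partition` and `shuffle` are hypothetical names, and the records are the ones from the desired output. As far as I know, Hadoop streaming also ships org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner, which can be passed via -partitioner together with -D mapreduce.partition.keypartitioner.options=-k1,1 to partition on the first field only, but check that against your Hadoop version.

```python
from collections import defaultdict

# Records from the question, one "col1 col2 col3" triple per entry.
RECORDS = [
    "10 1 15", "10 12 30", "10 9 45", "10 9 40", "10 15 55",
    "12 9 0", "12 7 18", "12 10 1",
    "9 1 1", "9 2 1", "9 0 1",
    "14 5 5",
]

# Hypothetical lookup table, mirroring the groupMap idea above.
GROUP_MAP = {"9": 0, "10": 1, "12": 2, "14": 3}


def partition(record):
    """Choose the reducer from the first column only."""
    return GROUP_MAP[record.split()[0]]


def shuffle(records):
    """Group records by reducer, then sort each group numerically."""
    parts = defaultdict(list)
    for rec in records:
        parts[partition(rec)].append(rec)
    for recs in parts.values():
        recs.sort(key=lambda r: tuple(int(f) for f in r.split()))
    return dict(parts)


reducer_input = shuffle(RECORDS)
```

Here `reducer_input[0]` holds only the records whose first column is 9, `reducer_input[1]` only those starting with 10, and so on, each sorted by the 2nd and 3rd columns.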
Wasn't sure how to word this question. Basically take the following BST:
           25
         /    \
       20      30
      /  \    /  \
    18    23 27   31
   /  \   / \
  8   19 22  24
If I were to delete the value 25 and rotate the value 20 into its place, does it make more sense to append the 23 subtree to 27, or to append the 30 subtree to 24? And I don't mean specific to this case, but from a broader perspective.
Just to be clear, what is preferable between these two arrangements:
         20
        /  \
      18    23
     /  \   / \
    8   19 22  24
                 \
                  30
                 /  \
                27   31

         20
        /  \
      18    30
     /  \   / \
    8   19 27  31
           /
          23
         /  \
        22   24
Rotating 20 into place would not be the wisest decision. You should replace 25 with either the maximum value of the left sub-tree or the minimum value of the right sub-tree.
If you do wish to rotate the way you have, there is no difference between the heights of the two trees, and both will be the same in terms of time complexity.
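A minimal sketch of the first suggestion, replacing the deleted value with the maximum of the left sub-tree. The `Node` class and the helper names (`insert`, `delete`, `inorder`, `height`) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node


def delete(node, key):
    """Delete key, replacing a two-child node by its in-order predecessor."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    elif node.left is None:
        return node.right
    elif node.right is None:
        return node.left
    else:
        pred = node.left              # maximum of the left sub-tree
        while pred.right is not None:
            pred = pred.right
        node.key = pred.key
        node.left = delete(node.left, pred.key)
    return node


def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)


def height(node):
    """Height counted in edges; an empty tree has height -1."""
    return -1 if node is None else 1 + max(height(node.left), height(node.right))


root = None
for k in [25, 20, 30, 18, 23, 27, 31, 8, 19, 22, 24]:
    root = insert(root, k)
root = delete(root, 25)
```

On the tree from the question this puts 24 at the root and keeps the height at 3 edges, whereas both arrangements drawn in the question have height 4.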
The goal is to remove 22 from the root node and re-balance the tree.
First I remove 22, and replace it by its in-order successor 28.
Secondly I rebalance the resulting tree, by moving the empty node to the left. The resulting tree is below.
Is moving 28 up the right procedure, and did I balance the left side correctly in the end?
           22,34
        /    |    \
      16     28     37
     /  \   /  \   /  \
   15   21 25  33 35   43

          [28],34
        /    |    \
      16     *      37
     /  \   /  \   /  \
   15   21 25  33 35   43

            34
          /    \
     16,28      37
    /  |  \    /  \
  15 21,25 33 35   43
Thanks!
To delete 22 from
           22,34
        /    |    \
      16     28     37
     /  \   /  \   /  \
   15   21 25  33 35   43
we replace it by its in-order successor 25, leaving a hole (*).
           25,34
        /    |    \
      16     28     37
     /  \   /  \   /  \
   15   21 *   33 35   43
We can't fix the hole by borrowing, so we merge its parent into its sibling, moving the hole up.
           25,34
        /    |    \
      16     *      37
     /  \    |     /  \
   15   21 28,33  35   43
The hole has two siblings now, so we can redistribute one of the parent's keys down.
            34
          /    \
     16,25      37
    /  |  \    /  \
  15  21 28,33 35  43
(I'm working from this set of lecture notes. Don't bother memorizing the details here unless it's for an exam. Even then... I really wish data structure courses did not emphasize balanced search trees to the degree that they do.)
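As a quick check of which key is the in-order successor of 22, here is a short sketch. The `Node` class and `inorder_successor` are hypothetical names; the function assumes the key sits in an internal node, so the successor is the leftmost key under the child just to the right of it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty for a leaf


def inorder_successor(node, key):
    """Successor of `key`, which must be stored in the internal node `node`."""
    i = node.keys.index(key)
    cur = node.children[i + 1]        # sub-tree just to the right of the key
    while cur.children:               # then keep going left down to a leaf
        cur = cur.children[0]
    return cur.keys[0]


root = Node([22, 34], [
    Node([16], [Node([15]), Node([21])]),
    Node([28], [Node([25]), Node([33])]),
    Node([37], [Node([35]), Node([43])]),
])
succ = inorder_successor(root, 22)
```

For this tree `succ` is 25, not 28, which is why the hole starts out in the leaf that held 25.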
I have a binary tree, and pre-order traversal.
Here is my tree.
        15
      /    \
    10      23
   /  \    /  \
  5    12 20   30
      /       /
    11      25
              \
               27
So the result of the pre-order traversal is: 15 10 5 12 11 23 20 30 25 27. It's OK.
Then I delete elements 5, 12 and 23.
Should I get this
        15
      /    \
    10      27
      \    /  \
       11 20   30
                 \
                  25
Result:15 10 11 27 20 30 25
or this?
        15
      /    \
    10      25
      \    /  \
       11 20   30
              /
            27
Result: 15 10 11 25 20 30 27
P.S. I get the 2nd case. If it isn't right, what is wrong with my deletion?
UPD: So is the second, updated variant right?
Your 2nd case is almost right. 27 would be a left node of 30. When deleting the top node of a (sub)tree, you can replace that node with either the right-most node of the left branch or the left-most node of the right branch. In this case, you've replaced 23 with the left-most value of the right branch, which is 25. You'd have to perform this recursively, as 25 has branches of its own. Once the node you want to delete becomes a leaf, delete it.
First step:
        15
      /    \
    10      25
      \    /  \
       11 20   30
              /
            23
              \
               27
Second step:
        15
      /    \
    10      25
      \    /  \
       11 20   30
              /
            27
           /
         23
Third (deletion):
        15
      /    \
    10      25
      \    /  \
       11 20   30
              /
            27
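The "keep swapping down until the target is a leaf" procedure can be sketched as follows. `Node`, `insert`, `delete_by_swapping` and `preorder` are hypothetical names, and the intermediate shapes may differ slightly from the drawings above (the swapped value can briefly sit on the wrong side of its parent), but the final tree is the same.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node


def delete_by_swapping(root, key):
    # Locate the node holding `key` and remember its parent.
    parent, node = None, root
    while node is not None and node.key != key:
        parent, node = node, (node.left if key < node.key else node.right)
    if node is None:
        return root                   # key not present
    # Swap the key downwards until it sits in a leaf.
    while node.left is not None or node.right is not None:
        p = node
        if node.right is not None:    # left-most node of the right branch
            s = node.right
            while s.left is not None:
                p, s = s, s.left
        else:                         # right-most node of the left branch
            s = node.left
            while s.right is not None:
                p, s = s, s.right
        node.key, s.key = s.key, node.key
        parent, node = p, s
    # Unlink the leaf.
    if parent is None:
        return None
    if parent.left is node:
        parent.left = None
    else:
        parent.right = None
    return root


def preorder(node):
    return [] if node is None else [node.key] + preorder(node.left) + preorder(node.right)


root = None
for k in [15, 10, 23, 5, 12, 20, 30, 11, 25, 27]:
    root = insert(root, k)
for k in [5, 12, 23]:
    root = delete_by_swapping(root, k)
result = preorder(root)
```

After deleting 5, 12 and 23, `result` is 15 10 11 25 20 30 27, matching the final tree above with 27 as the left child of 30.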
If you want pre-order traversal of the remaining elements to be consistent with the pre-order traversal before the deletes, then your tree should look like this:
        15
      /    \
    10      20
      \       \
       11      30
              /
            25
              \
               27
The delete method is:
If there's a left subtree of a deleted node, move the root of the left subtree to the deleted position. Follow the right subtree links in the (formerly) left subtree and attach the right subtree to the first empty right link. If there is no left subtree, move the root of the right subtree to the deleted position.
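The delete method above can be sketched like this; `Node`, `insert`, `delete` and `preorder` are names chosen for the example, not taken from any library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return node


def delete(node, key):
    """Delete key so that the pre-order of the remaining keys is unchanged."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
        return node
    if key > node.key:
        node.right = delete(node.right, key)
        return node
    if node.left is None:
        return node.right             # promote the right subtree (may be None)
    # Promote the left subtree, then hang the old right subtree on the
    # first empty right link along its right spine.
    spine = node.left
    while spine.right is not None:
        spine = spine.right
    spine.right = node.right
    return node.left


def preorder(node):
    return [] if node is None else [node.key] + preorder(node.left) + preorder(node.right)


root = None
for k in [15, 10, 23, 5, 12, 20, 30, 11, 25, 27]:
    root = insert(root, k)
before = preorder(root)
for k in [5, 12, 23]:
    root = delete(root, k)
after = preorder(root)
```

Here `after` is 15 10 11 20 30 25 27, which is `before` with 5, 12 and 23 removed. The trade-off is that this method can make the tree taller than replacing the node with its predecessor or successor would.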