AVL trees insertion and deletion - data-structures

I would like to know whether I am applying the following insertion and deletion operations correctly on an AVL tree:
             62
           /    \
       44          78
     /    \          \
   17      50         88
          /  \
         48  54
insert(42)
insert(90)
delete(62)
insert(92)
delete(50)
For this question, a deletion replaces the deleted item with its successor.
This is how I think the tree should be modified by those operations:
insert(42) and insert(90)
             62
           /    \
       44          78
     /    \          \
   17      50         88
     \    /  \          \
     42  48  54          90
delete(62)
             78
           /    \
       44          88
     /    \          \
   17      50         90
     \    /  \
     42  48  54
insert(92)
             78
           /    \
       44          88
     /    \          \
   17      50         90
     \    /  \          \
     42  48  54          92
delete(50)
             78
           /    \
       44          88
     /    \          \
   17      54         90
     \    /             \
     42  48              92

There are two cases where rotations are needed:
         ___62___
        /        \
    __44__        78
   /      \         \
  17      50         88
         /  \
        48  54
You had applied insert(42) correctly, but insert(90) creates an unbalanced subtree rooted at 78 (marked with asterisk): its right side has a height of 2, while its left side is empty:
         ___62___
        /        \
    __44__        78*
   /      \         \
  17      50         88
    \    /  \          \
    42  48  54          90
So, this will not stay like that: a simple left rotation will move 88 up, and 78 down:
         ___62___
        /        \
    __44__        88
   /      \      /  \
  17      50   78    90
    \    /  \
    42  48  54
You had it correct for delete(62): that will swap the root with its successor, which is 78, and then 62 is removed:
         ___78___
        /        \
    __44__        88
   /      \         \
  17      50         90
    \    /  \
    42  48  54
insert(92) creates an imbalance at node 88:
         ___78___
        /        \
    __44__        88*
   /      \         \
  17      50         90
    \    /  \          \
    42  48  54          92
And so a simple left rotation is again applied:
         ___78___
        /        \
    __44__        90
   /      \      /  \
  17      50   88    92
    \    /  \
    42  48  54
delete(50) was correctly executed. Given the above state, we get:
         ___78___
        /        \
    __44__        90
   /      \      /  \
  17      54   88    92
    \    /
    42  48
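
To check a sequence like this by machine rather than by hand, here is a minimal AVL sketch in Python. It assumes the starting tree can be built by inserting its keys level by level (which triggers no rotations), and that a deletion of a node with two children uses the in-order successor, as the question states. All function names (`insert`, `delete`, `rebalance`, `replay`) are just illustrative choices.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # height counted in nodes; an empty subtree is 0


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))
    return node


def rotate_left(x):
    # The right child y moves up, x becomes y's left child.
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    return update(y)


def rotate_right(y):
    # Mirror image of rotate_left.
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    return update(x)


def rebalance(node):
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:  # left-heavy
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)  # left-right case
        return rotate_right(node)
    if balance < -1:  # right-heavy
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)  # right-left case
        return rotate_left(node)
    return node


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return rebalance(node)


def delete(node, key):
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    elif node.left is None:
        return node.right
    elif node.right is None:
        return node.left
    else:
        succ = node.right  # in-order successor: leftmost node on the right
        while succ.left:
            succ = succ.left
        node.key = succ.key
        node.right = delete(node.right, succ.key)
    return rebalance(node)


def preorder(node):
    return [] if node is None else [node.key] + preorder(node.left) + preorder(node.right)


def replay():
    root = None
    for key in (62, 44, 78, 17, 50, 88, 48, 54):  # builds the starting tree
        root = insert(root, key)
    for op, key in ((insert, 42), (insert, 90), (delete, 62), (insert, 92), (delete, 50)):
        root = op(root, key)
    return root


print(preorder(replay()))  # [78, 44, 17, 42, 54, 48, 90, 88, 92]
```

The printed pre-order corresponds to the final tree above: 78 at the root, 44 and 90 as its children.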


hadoop mapreduce.partition.keypartitioner.options not working

I only want to partition the data so that all records whose first key field is the same go to the same reducer. For example, all [ 11 * * * ] records.
But it seems the key partitioner does not work, and I really don't know why.
The code run.sh is here --->
#!/usr/bin/sh
hadoop fs -rm -r /training/likang/tmp2
hadoop fs -rm /training/likang/tmp/testfile
hadoop fs -put testfile1 /training/likang/tmp/testfile
hadoop-streaming -D stream.map.output.field.separator="\t" \
-D stream.num.map.output.key.fields=2 \
-D map.output.key.field.separator="\t" \
-D mapreduce.partition.keypartitioner.options=-k1,1 \
-D mapreduce.job.maps=2 \
-D mapreduce.job.reduces=2 \
-D mapred.job.name="lk_filt_rid" \
-partitioner org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner \
-input /training/likang/tmp/testfile \
-output /training/likang/tmp2 \
-mapper "cat" -reducer "cat"
hadoop fs -cat /training/likang/tmp2/part-00000
echo "------------------"
hadoop fs -cat /training/likang/tmp2/part-00001
The Input File is testfile1 --->
11 5 333 111
11 5 777 000
11 3 888 999
11 9 988 888
11 7 234 2342
11 5 4 4
15 9 230 134
12 8 232 834
15 77 220 000
15 33 256 399
11 5 999 888
15 9 222 111
14 88 372 233
15 9 66 77
11 5 821 221
11 0 11 11
15 0 22 22
12 0 33 33
14 0 44 44
The result is that the [ 11 * * * ] records are not all sent to the same reducer. Does anybody know why? Thank you.
Now I know: it helps to delete this line
-D map.output.key.field.separator="\t" \
After deleting this option the result is right, but I am even more confused about the reason.
The default value of map.output.key.field.separator seems to be just a tab, yet setting it explicitly here breaks the partitioning.

Binary Search Tree deletion procedure with children

Wasn't sure how to word this question. Basically take the following BST:
              25
           /      \
       20           30
     /    \        /  \
   18      23    27    31
  /  \    /  \
 8   19  22  24
If I were to delete the value 25 and rotate the value 20 into its place, does it make more sense to append the 23 subtree to 27, or to append the 30 subtree to 24? I don't mean specifically in this case, but from a broader perspective.
Just to be clear, what is preferable between these two arrangements:
       20
     /    \
   18      23
  /  \    /  \
 8   19  22  24
               \
                30
               /  \
             27    31

       20
     /    \
   18      30
  /  \    /  \
 8   19  27  31
        /
      23
     /  \
   22    24
Rotating 20 into place would not be the wisest decision. You should replace 25 with either the maximum value of the left subtree (24) or the minimum value of the right subtree (27).
If you do wish to rotate the way you have, there is no difference between the heights of the two trees, so both behave the same in terms of time complexity.
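
To make the comparison concrete, here is a small Python sketch. It represents each tree as a nested `(key, left, right)` tuple (an illustrative choice, not anything from the question), deletes the root by replacing it with the maximum of the left subtree, and compares the resulting height with the two arrangements from the question.

```python
def leaf(key):
    return (key, None, None)


def height(tree):
    # Height counted in nodes; an empty tree has height 0.
    return 0 if tree is None else 1 + max(height(tree[1]), height(tree[2]))


def remove_max(tree):
    # Returns (largest key, tree without that key).
    key, left, right = tree
    if right is None:
        return key, left
    largest, new_right = remove_max(right)
    return largest, (key, left, new_right)


def delete_root(tree):
    # Replace the root with its in-order predecessor (max of the left subtree).
    key, left, right = tree
    if left is None:
        return right
    pred, new_left = remove_max(left)
    return (pred, new_left, right)


left_part = (18, leaf(8), leaf(19))
original = (25, (20, left_part, (23, leaf(22), leaf(24))), (30, leaf(27), leaf(31)))
# The two arrangements from the question, with 20 moved to the root.
option_a = (20, left_part, (23, leaf(22), (24, None, (30, leaf(27), leaf(31)))))
option_b = (20, left_part, (30, (27, (23, leaf(22), leaf(24)), None), leaf(31)))

replaced = delete_root(original)
print(replaced[0], height(replaced), height(option_a), height(option_b))  # 24 4 5 5
```

Replacing the root with 24 keeps the height at 4 nodes, while both arrangements with 20 at the root grow to 5.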

Proper way to Re-Balance a 2-3 Tree after deleting the root node

The goal is to remove 22 from the root node and re-balance the tree.
First I remove 22, and replace it by its in-order successor 28.
Secondly I rebalance the resulting tree, by moving the empty node to the left. The resulting tree is below.
Is moving 28 up the right procedure, and did I balance the left side correctly in the end?
          22,34
       /    |    \
   16       28       37
  /  \     /  \     /  \
15    21 25    33 35    43

         [28],34
       /    |    \
   16       *        37
  /  \     /  \     /  \
15    21 25    33 35    43

            34
         /      \
    16,28          37
   /  |  \        /  \
 15 21,25 33    35    43
Thanks!
To delete 22 from
          22,34
       /    |    \
   16       28       37
  /  \     /  \     /  \
15    21 25    33 35    43 ,
we replace it by its in-order successor 25, leaving a hole (*).
          25,34
       /    |    \
   16       28       37
  /  \     /  \     /  \
15    21 *     33 35    43
We can't fix the hole by borrowing, so we merge its parent into its sibling, moving the hole up.
          25,34
       /    |    \
   16       *        37
  /  \      |       /  \
15    21  28,33   35    43
The hole has two siblings now, so we can redistribute one of the parent's keys down.
            34
         /      \
    16,25          37
   /  |  \        /  \
 15  21  28,33  35    43
(I'm working from this set of lecture notes. Don't bother memorizing the details here unless it's for an exam. Even then... I really wish data structure courses did not emphasize balanced search trees to the degree that they do.)

convert comma separated list in text file into columns in bash

I've managed to extract data (from an html page) that goes into a table, and I've isolated the columns of said table into a text file that contains the lines below:
[30,30,32,35,34,43,52,68,88,97,105,107,107,105,101,93,88,80,69,55],
[28,6,6,50,58,56,64,87,99,110,116,119,120,117,114,113,103,82,6,47],
[-7,,,43,71,30,23,28,13,13,10,11,12,11,13,22,17,3,,-15,-20,,38,71],
[0,,,3,5,1.5,1,1.5,0.5,0.5,0,0.5,0.5,0.5,0.5,1,0.5,0,-0.5,-0.5,2.5]
Each bracketed list of numbers represents a column. What I'd like to do is turn these lists into actual columns that I can work with in different data formats. I'd also like to be sure to include the blank parts of these lists (i.e., "[,,,]")
This is basically what I'm trying to accomplish:
30 28 -7 0
30 6
32 6
35 50 43 3
34 58 71 5
43 56 30 1.5
52 64 23 1
. . . .
. . . .
. . . .
I'm parsing data from a web page, and ultimately planning to make the process as automated as possible so I can easily work with the data after I output it to a nice format.
Anyone know how to do this, have any suggestions, or thoughts on scripting this?
Since you have your lists in python, just do it in python:
l = [["30", "30", "32"], ["28", "6", "6"], ["-7", "", ""], ["0", "", ""]]
for row in zip(*l):
    print("\t".join(row))
produces
30 28 -7 0
30 6
32 6
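
One caveat with plain zip is that it stops at the shortest list, and the columns in the question have different lengths. A minimal sketch that parses the bracketed lines directly and pads the shorter columns, assuming every line looks exactly like the ones in the question (the helper names `parse_column` and `columns_to_rows` are made up for this example):

```python
from itertools import zip_longest


def parse_column(line):
    # "[30,30,32]," -> ["30", "30", "32"]; empty entries stay as "".
    return line.strip().rstrip(",").strip("[]").split(",")


def columns_to_rows(lines):
    columns = [parse_column(line) for line in lines if line.strip()]
    # zip_longest pads the shorter columns instead of truncating like zip.
    return ["\t".join(row) for row in zip_longest(*columns, fillvalue="")]


sample = ["[30,30,32],", "[28,6,6],", "[-7,,,43],", "[0,,]"]
for row in columns_to_rows(sample):
    print(row)
```

Since `columns_to_rows` only iterates over its argument, an open file object can be passed instead of the sample list.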
awk based solution:
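awk '{gsub(/\[|\]/, ""); sub(/,$/, ""); n = split($0, f, ","); if (n > max) max = n; for (i = 1; i <= n; i++) cell[NR, i] = f[i]}
END {for (i = 1; i <= max; i++) {row = cell[1, i]; for (r = 2; r <= NR; r++) row = row "\t" cell[r, i]; print row}}' file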
30 28 -7 0
30 6
32 6
35 50 43 3
34 58 71 5
43 56 30 1.5
52 64 23 1
..........
..........
Another solution, but it works only for a file with exactly 4 lines:
$ paste \
<(sed -n '1{s,\[,,g;s,\],,g;s|,|\n|g;p}' t) \
<(sed -n '2{s,\[,,g;s,\],,g;s|,|\n|g;p}' t) \
<(sed -n '3{s,\[,,g;s,\],,g;s|,|\n|g;p}' t) \
<(sed -n '4{s,\[,,g;s,\],,g;s|,|\n|g;p}' t)
30 28 -7 0
30 6
32 6
35 50 43 3
34 58 71 5
43 56 30 1.5
52 64 23 1
68 87 28 1.5
88 99 13 0.5
97 110 13 0.5
105 116 10 0
107 119 11 0.5
107 120 12 0.5
105 117 11 0.5
101 114 13 0.5
93 113 22 1
88 103 17 0.5
80 82 3 0
69 6 -0.5
55 47 -15 -0.5
-20 2.5
38
71
Updated: or another version with preprocessing:
$ sed 's|\[||;s|\][,]\?||' t >t2
$ paste \
<(sed -n '1{s|,|\n|g;p}' t2) \
<(sed -n '2{s|,|\n|g;p}' t2) \
<(sed -n '3{s|,|\n|g;p}' t2) \
<(sed -n '4{s|,|\n|g;p}' t2)
If a file named data contains the data given in the problem (exactly as defined above), then the following bash command line will produce the output requested:
$ sed -e 's/\[//' -e 's/\]//' -e 's/,/ /g' <data | rs -T
Example:
$ cat data
[30,30,32,35,34,43,52,68,88,97,105,107,107,105,101,93,88,80,69,55],
[28,6,6,50,58,56,64,87,99,110,116,119,120,117,114,113,103,82,6,47],
[-7,,,43,71,30,23,28,13,13,10,11,12,11,13,22,17,3,,-15,-20,,38,71],
[0,,,3,5,1.5,1,1.5,0.5,0.5,0,0.5,0.5,0.5,0.5,1,0.5,0,-0.5,-0.5,2.5]
$ sed -e 's/\[//' -e 's/\]//' -e 's/,/ /g' <data | rs -T
30 28 -7 0
30 6 43 3
32 6 71 5
35 50 30 1.5
34 58 23 1
43 56 28 1.5
52 64 13 0.5
68 87 13 0.5
88 99 10 0
97 110 11 0.5
105 116 12 0.5
107 119 11 0.5
107 120 13 0.5
105 117 22 1
101 114 17 0.5
93 113 3 0
88 103 -15 -0.5
80 82 -20 -0.5
69 6 38 2.5
55 47 71

Binary tree deletion

I have a binary tree, and pre-order traversal.
Here is my tree.
           15
         /     \
     10          23
    /  \        /  \
  5     12    20    30
       /           /
     11          25
                   \
                    27
So the result of the pre-order traversal is: 15 10 5 12 11 23 20 30 25 27. That's OK.
Then I delete the elements 5, 12 and 23.
Should I get this
      15
    /    \
  10      27
    \    /  \
    11  20  30
              \
               25
Result: 15 10 11 27 20 30 25
or this?
      15
    /    \
  10      25
    \    /  \
    11  20  30
           /
          27
Result: 15 10 11 25 20 30 27
P.S. I get the 2nd case. If it isn't right, what is wrong with my deletion?
UPD: So the second, updated variant is right?
Your 2nd case is almost right. 27 would be a left node of 30. When deleting the top node of a (sub)tree, you can replace that node with either the right-most node of the left branch or the left-most node of the right branch. In this case, you've replaced 23 with the left-most value of the right branch, which is 25. You'd have to perform this recursively, as 25 has branches of its own. Once the node you want to delete becomes a leaf, delete it.
First step:
      15
    /    \
  10      25
    \    /  \
    11  20  30
           /
          23
            \
             27
Second step:
      15
    /    \
  10      25
    \    /  \
    11  20  30
           /
          27
         /
        23
Third (deletion):
      15
    /    \
  10      25
    \    /  \
    11  20  30
           /
          27
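
For reference, a minimal Python sketch of deletion by in-order successor. Instead of swapping 23 downwards step by step as in the drawings, it copies the successor's key up and then deletes the successor from the right branch, which ends in the same tree. The `Node` class and the `build` helper are assumptions made for this example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def delete(node, key):
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    elif node.left is None:
        return node.right
    elif node.right is None:
        return node.left
    else:
        succ = node.right  # left-most node of the right branch
        while succ.left is not None:
            succ = succ.left
        node.key = succ.key
        node.right = delete(node.right, succ.key)
    return node


def preorder(node):
    return [] if node is None else [node.key] + preorder(node.left) + preorder(node.right)


def build():
    # The tree from the question.
    return Node(15,
                Node(10, Node(5), Node(12, Node(11))),
                Node(23, Node(20), Node(30, Node(25, None, Node(27)))))


root = build()
for key in (5, 12, 23):
    root = delete(root, key)
print(preorder(root))  # [15, 10, 11, 25, 20, 30, 27]
```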
If you want pre-order traversal of the remaining elements to be consistent with the pre-order traversal before the deletes, then your tree should look like this:
      15
    /    \
  10      20
    \       \
    11       30
            /
          25
            \
             27
The delete method is:
If there's a left subtree of a deleted node, move the root of the left subtree to the deleted position. Follow the right subtree links in the (formerly) left subtree and attach the right subtree to the first empty right link. If there is no left subtree, move the root of the right subtree to the deleted position.
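
The delete method just described can be sketched in Python roughly as follows. The `Node` class and `build` helper are assumptions for the example, and note that this variant keeps the pre-order stable at the cost of possibly making the tree deeper.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def delete(node, key):
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
        return node
    if key > node.key:
        node.right = delete(node.right, key)
        return node
    if node.left is None:
        # No left subtree: the right subtree moves up.
        return node.right
    # Follow right links in the left subtree to the first empty right link
    # and hang the deleted node's right subtree there.
    rightmost = node.left
    while rightmost.right is not None:
        rightmost = rightmost.right
    rightmost.right = node.right
    return node.left


def preorder(node):
    return [] if node is None else [node.key] + preorder(node.left) + preorder(node.right)


def build():
    # The tree from the question.
    return Node(15,
                Node(10, Node(5), Node(12, Node(11))),
                Node(23, Node(20), Node(30, Node(25, None, Node(27)))))


root = build()
for key in (5, 12, 23):
    root = delete(root, key)
print(preorder(root))  # [15, 10, 11, 20, 30, 25, 27]
```

The result is the original pre-order with 5, 12 and 23 simply dropped.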
