Balance BST tree manually - data-structures

I balanced the requested tree (BST to AVL) by hand, and it was surprisingly easy, so I am not sure whether I did it correctly.
      a
     / \
    b   e3
   / \
  e1  e2
initial state is:
'a' is parent of 'b'(left) and 'e3'(right), 'b' is a parent of 'e1'(left) and 'e2'(right).
applying right rotation gives us:
    b
   / \
  e1  a
     / \
    e2  e3
'b' takes the place of 'a', with 'e1' as its left child and 'a' as its right child; 'a' gets 'e2' (formerly the right child of 'b') as its left child.
So the questions:
1. If e1 is itself a subtree or node containing other elements, can I still do this rotation?
2. If e2 and e3 are absent, can I still do this rotation?
example 11; 12;16
      16
     /
    13
   /
  10
Initial state: 16 is the parent of 13, and 13 is the parent of 10.
Can I turn it into this: 13 is the parent of 10 (left) and 16 (right)?
I know it's simplistic, but theory often does not cover these things, assuming they are clear. Well, not for everyone.
Thanks for the help.

Yes to everything, really. Think about the order property: left descendants < node and node < right descendants. Note how the rotation preserves this; compare a and b to e1, e2 and e3 before the rotation, and check the order and descendant relationships after the rotation. I'll let you ponder how before giving it away.
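To check the claim above by experiment, here is a minimal Python sketch of the right rotation described in the question. The names `Node`, `rotate_right` and `inorder` are made up for this illustration; the only requirement is that the rotated node has a left child, while e1, e2 and e3 may be absent or whole subtrees.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(a):
    """Rotate the subtree rooted at `a` to the right and return the new root."""
    b = a.left          # b must exist; e1, e2, e3 may be None or whole subtrees
    a.left = b.right    # e2 moves from b's right side to a's left side
    b.right = a         # a becomes the right child of b
    return b

def inorder(node):
    """Keys in left-to-right order; sorted if the tree is a valid BST."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)

# The second example from the question: 16 -> 13 -> 10, all left children.
chain = Node(16, Node(13, Node(10)))
root = rotate_right(chain)
print(root.key, root.left.key, root.right.key)   # 13 10 16
```

The in-order sequence of keys is the same before and after the rotation, which is another way of saying that the order property is preserved.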

Related

In the binary addition algorithm, what is the point of multiplying the carry by 2?

To add a and b, first add their rightmost bits. This gives
a0 + b0 = c0 ⋅ 2 + s0,

where s0 is the rightmost bit in the binary expansion of a + b and c0 is the carry, which is either 0 or 1.
Then add the next pair of bits and the carry: a1 + b1 + c0 = c1 ⋅ 2 + s1.
In this second step we just add the carry c0 as it is, without multiplying it by 2. Why? Or am I wrong here?
Thanks in advance.
I will try to explain this with a simple example.
We are adding 3(a) + 5(b) or 11 + 101. Following the described algorithm above we get.
To add 11(a) and 101(b), first add their rightmost bits. 1(a0) + 1(b0). This gives 1(a0) + 1(b0) = 1(c0) * 2 + 0(s0).
Multiplying by 2 in binary is a bit shift: you are moving that 1 to the next binary place in the number, so 1*2+0 = 10 (binary), which is the result of 1+1.
Then, for the next pair of bits: 1(a1) + 0(b1) + 1(c0) = 1(c1)*2 + 0(s1).
This may seem counterintuitive, but c0 was produced by the sum in the first binary place (index 0). The multiplication by 2 in the first equation is what moves it to the second binary place (index 1), where it can be added directly to a1 and b1. That is why c0 appears in the second equation without a further factor of 2.
Without that shift, the second step would effectively be adding 10(a1) + 00(b1) + 01(c0), with the carry still sitting in the first binary place, which is not the desired result for the second binary place.
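The same bookkeeping can be written as a short Python sketch. The function name `add_bits` and the choice to store bits least significant first are assumptions made for this illustration. The `divmod(total, 2)` line is exactly the equation a_i + b_i + c_(i-1) = c_i ⋅ 2 + s_i.

```python
def add_bits(a, b):
    """Add two numbers given as bit lists, least significant bit first."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))   # pad the shorter number with leading zeros
    b = b + [0] * (n - len(b))
    carry, result = 0, []
    for i in range(n):
        total = a[i] + b[i] + carry    # all three terms sit in binary place i
        carry, s = divmod(total, 2)    # total = carry * 2 + s
        result.append(s)
    result.append(carry)               # the final carry is the top bit
    return result

print(add_bits([1, 1], [1, 0, 1]))   # 3 + 5 -> [0, 0, 0, 1], i.e. 1000 in binary
```

Inside the loop the carry is never multiplied by 2 explicitly: it is simply used one position later, which is the shift.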

Interview question: minimum number of swaps to make couples sit together

This is an interview question, and the problem description is as follows:
There are n couples sitting in a row with 2n seats. Find the minimum number of swaps to make everyone sit next to his/her partner. For example, 0 and 1 are a couple, and 2 and 3 are a couple. Originally they are sitting in a row in this order: [2, 0, 1, 3]. The minimum number of swaps is 1, for example swapping 2 with 1.
I know there is a greedy solution for this problem. You just need to scan the array from left to right. Every time you see an unmatched pair, you swap the first person of the pair to his/her correct position. For example, in the above example for pair [2, 0], you will directly swap 2 with 1. There is no need to try swapping 0 with 3.
But I don't really understand why this works. One of the proofs I saw was something like this:
Consider a simple example: 7 1 4 6 2 3 0 5. At first step we have two choices to match the first couple: swap 7 with 0, or swap 1 with 6. Then we get 0 1 4 6 2 3 7 5 or 7 6 4 1 2 3 0 5. Pay attention that the first couple doesn't count any more. For the later part it is composed of 4 X 2 3 Y 5 (X=6 Y=7 or X=1 Y=0). Since different couples are unrelated, we don't care X Y is 6 7 pair or 0 1 pair. They are equivalent! Thus it means our choice doesn't count.
I feel that this is very reasonable but not compelling enough. In my opinion we have to prove that X and Y are a couple in all possible cases, and I don't know how. Can anyone give a hint? Thanks!
I've split the problem into 3 examples. The A's are a pair and so are the B's in all examples. Note that throughout the examples a match requires that the elements are adjacent and that the first element occupies an index that satisfies index % 2 = 0. An array looking like [X A1 A2 ...] does not satisfy this condition, but [X Y A1 A2 ...] does. The examples also do not look to the left at all, because looking to the left of A2 below is the same as looking to the right of A1.
First example
There's an even number of elements between two unmatched pairs:
A1 B1 ..2k.. A2 B2 .. for any number k in {0, 1, 2, ..}, meaning A1 B1 A2 B2 .. is just another case.
Both can be matched in one swap:
A1 A2 ..2k.. B1 B2 .. or B2 B1 ..2k.. A2 A1 ..
Order is not important, so it doesn't matter which pair is first. Once the pairs are matched, there will be no more swapping involving either pair. Finding A2 based on A1 will result in the same amount of swaps as finding B2 based on B1.
Second example
There's an odd number of elements between two pairs (2k + the element C):
A1 B1 ..2k.. C A2 B2 D .. (A1 B1 ..2k.. C B2 A2 D .. is identical)
Both cannot be matched in one swap, but like before it doesn't matter which pair is first nor if the matched pair is in the beginning or in the middle part of the array, so all these possible swaps are equally valid, and none of them creates more swaps later on:
A1 A2 ..2k .. C B1 B2 D .. or B2 B1 ..2k.. C A2 A1 D .. Note that the last pair is not matched
C B1 ..2k.. A1 A2 B2 D .. or A1 D ..2k.. C A2 B2 B1 .. Here we're not matching the first pair.
The important thing about this is that in each case only one pair is matched, and none of the elements of that pair will need to be swapped again. The remaining unmatched pair ends up in one of these forms:
..2k.. C B1 B2 D ..
..2k.. C A2 A1 D ..
C B1 ..2k.. B2 D ..
A1 D ..2k.. C A2 ..
They are clearly equivalent in terms of swaps needed to match the remaining A's or B's.
Third example
This is logically identical to the second. Both B1/A2 and A2/B2 can have any number of elements between them. No matter how elements are swapped, only one pair can be matched. m1 and m2 are arbitrary number of elements. Note that elements X and Y are just the elements surrounding B2, and they're only used to illustrate the example:
A1 B1 ..m1.. A2 ..m2.. X B2 Y .. (A1 B1 ..m1.. B2 ..m2.. X A2 Y .. is identical)
Again both pairs cannot be matched in one swap, but it's not important which pair is matched, or where the matched pair position is:
A1 A2 ..m1.. B1 ..m2.. X B2 Y .. or B2 B1 ..m1.. A2 ..m2.. X A1 Y .. Note that the last pair is not matched
A1 X ..m1.. A2 ..m2-1.. B1 B2 Y .. or A1 Y ..m1.. A2 ..m2.. X B2 B1.. depending on position of B2. Here we're not matching the first pair.
Matching the pair around A2 is equivalent, but omitted.
As in the second example, one swap can also be matching a pair in the beginning or in the middle of the array, but either choice doesn't change that only one pair is matched. Nor does it change the remaining amount of unmatched pairs.
A little analysis
Keeping in mind that matched pairs drop out of the list of unmatched/problem pairs, each swap leaves the list of unmatched pairs either one or two pairs shorter. Since it's not important which pair drops out of the problem, it might as well be the first. In that case we can assume that the pairs to the left of the cursor/current index are all matched, and that we only need to match the first pair, unless it's already matched by coincidence, in which case the cursor is simply moved on.
It becomes even more clear if the above examples are looked at with the cursor being at the second unmatched pair, instead of the first. It still doesn't matter which pairs are swapped for the amount of total swaps needed. So there's no need to try to match pairs in the middle. The resulting amount of swaps are the same.
The only time two pairs can be matched with only one swap are those in the first example. There is no way to match two pairs in one swap in any other setup. Looking at the result of the swap in the second and third examples, it also becomes clear that none of the results have any advantage to any of the others and that each result becomes a new problem that can be described as one of the three cases (two cases really, because second and third are equivalent in terms of match-able pairs).
Optimal swapping
There is no way to modify the array to prepare it for more optimal swapping later on. Either a swap will match one or two pairs, or it will count as a swap with no matches:
Looking at this: A1 B1 ..2k.. C B2 ... A2 ...
Swap to prepare for optimal swap:
A1 B1 ..2k.. A2 B2 ... C ... no matches
A1 A2 ..2k.. B1 B2 ... C ... two in one
Greedy swap:
B2 B1 ..2k.. C A1 ... A2 ... one
B2 B1 ..2k.. A2 A1 ... C ... one
Un-matching pairs
Pairs already matched will not become unmatched because that would require that:
For A1 B1 ..2k.. C A2 B2 D ..
C is identical to A1 or
D is identical to B1
either of which is impossible.
Likewise with A1 B1 ..m1.. (Z) A2 (V) ..m2.. X B2 Y ..
Or it would require that matched pairs are shifted by one (or any odd number of) index inside the array. That's also not possible, because we only ever swap, so the array elements aren't being shifted at all.
[Edited for clarity 4-Mar-2020.]
There is no point doing a swap which does not put (at least) one couple together. To do so would add 1 to the swap count and leave us with the same number of unpaired couples.
So, each time we do a swap, we put one couple together leaving at most n-1 couples. Repeating the process we end up with 1 pair, who must by then be a couple. So, the worst case must be n-1 swaps.
Clearly, we can ignore couples who are already together.
Clearly, where we have two pairs a:B b:A, one swap will create the two couples a:A b:B.
And if we have m pairs a:Q b:A c:B ... q:P -- where the m pairs are a "disjoint subset" (or cycle) of couples, m-1 swaps will put them into couples.
So: the minimum number of swaps is going to be n - s where s is the number of "disjoint subsets" (and s >= 1). [A subset may, of course, contain just one couple.]
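The n - s formula can be checked with a small union-find sketch in Python. The function name `min_swaps_by_cycles` is made up for this illustration, and it assumes the numbering from the question, where persons 2k and 2k+1 form couple k. Each pair of adjacent seats links the two couples sitting in it; s is then the number of connected groups of couples.

```python
def min_swaps_by_cycles(row):
    """Return n - s, where s is the number of disjoint subsets of couples."""
    n = len(row) // 2
    parent = list(range(n))          # one union-find entry per couple

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    # Seats (0,1), (2,3), ... each link the couples of the two people in them.
    for i in range(0, 2 * n, 2):
        a, b = find(row[i] // 2), find(row[i + 1] // 2)
        parent[a] = b

    s = len({find(c) for c in range(n)})
    return n - s

print(min_swaps_by_cycles([2, 0, 1, 3]))               # 1
print(min_swaps_by_cycles([7, 1, 4, 6, 2, 3, 0, 5]))   # 2
```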
Interestingly, there is nothing clever you can do to reduce the number of swaps. Provided every swap creates a couple you will do the minimum number.
If you wanted to arrange each couple in height order as well, things may or may not be more interesting.
FWIW: having shown that you cannot do better than n-1 swaps for each disjoint set of n couples, the trick then is to avoid the O(n^2) search for each swap. That can be done relatively straightforwardly by keeping a vector with one entry per person, giving where they are currently sat. Then in one scan you pick up each person and if you know where their partner is sat, swap down to make a pair, and update the location of the person swapped up.
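A possible Python sketch of that single scan is below. The name `min_swaps_greedy` and the `pos` lookup table are assumptions for this illustration, as is the numbering where the partner of person p is `p ^ 1` (0 with 1, 2 with 3, and so on).

```python
def min_swaps_greedy(row):
    """Greedy left-to-right scan with a position table, O(n) overall."""
    row = list(row)                              # work on a copy
    pos = {person: i for i, person in enumerate(row)}
    swaps = 0
    for i in range(0, len(row), 2):
        partner = row[i] ^ 1                     # 0<->1, 2<->3, 4<->5, ...
        if row[i + 1] != partner:
            j = pos[partner]                     # where the partner sits now
            row[i + 1], row[j] = row[j], row[i + 1]
            pos[row[j]] = j                      # the person who was moved away
            pos[row[i + 1]] = i + 1              # the partner, now in place
            swaps += 1
    return swaps

print(min_swaps_greedy([2, 0, 1, 3]))               # 1
print(min_swaps_greedy([7, 1, 4, 6, 2, 3, 0, 5]))   # 2
```

Every swap here creates a couple, so by the argument above the count is the minimum.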
I will swap every even positioned member,
if he/she doesn't sit besides his/her partner.
Even positioned means array indexed 1, 3, 5 and so on.
The couples are [even, odd] pair. For example [0, 1], [2, 3], [4, 5] and so on.
The loop will be like that:
for(i=1; i<n*2; i+=2) // when n = # of couples.
Now we check the members at index i and index i-1. If they are not a couple, we look for the partner of the (i-1)-th member and, once we have found it, swap it with the i-th member.
For example, say at i=1 we have 6. If the (i-1)-th element is 7, then they form a couple and we don't need any swap (if the (i-1)-th element is 5, then [5, 6] is not a couple). Otherwise we look for the partner of the (i-1)-th element and swap it with the i-th element, so that the (i-1)-th and i-th members form a couple.
This ensures that we need to check only half of the total members, that is, n.
For any non-matched couple we need a linear search from the i-th position through the rest of the array, which is O(2n), i.e. O(n).
So the overall complexity of the technique is O(n^2).
In the worst case, the minimum number of swaps is n-1 (this is the maximum as well).
Very straightforward. If you need help to code, let us know.

Priority Search Tree confusion

The only reasonable slide set I found is this, which on page 15 says, for building:
Sort all points by their x coordinate value and store them in the leaf nodes of a balanced binary tree (i.e., a range tree)
Starting at the root, each node contains the point in its subtree with the maximum value for its y coordinate that has not been stored at a shallower depth in the tree; if no such node exists, then node is empty
I implemented a Range Tree just before, so based on the example I used there, I now have (for a Priority Search Tree):
                          7
                   ------ 0 ---------------------
                   |                            |
                   6                            6
             ---- -3 ----                  ---- 4 ------
             |          |                  |           |
             4          2                  1           3
       ---- -4 -       -2              --- 3 ---       5
       |       |       / \             |       |      / \
       0    (-3,4) (-2,2)(0,7)        NIL   (4,-4) (5,3)(6,-1)
      -5                               2
      / \                             / \
(-5,6) (-4,0)                    (2,1) (3,6)
Here, every inner node is of the form:
max y
mid_x
and the range query I am posing is (-inf, 1) x (-0.5, 5), i.e. (x_low, x_high) x (y_low, y_high).
So, let's start from the root. I check if x (=0) is in (-inf, 1) and if 7 > (-0.5, 5). It is, thus I proceed to both children, but for simplicity, let me follow the left one (in all cases from now on).
I check if -3 is in the x range and if 6 is greater than or equal to the upper bound of the y range of the query (i.e. 5). It is, so I continue.
Same for the next level, so we go one level further down; now please focus on this inner node. It has 0 as max y, which indicates that the max y of the subtree is 0, which is incorrect (left child is (-5, 6))*.
What am I missing in building/searching process?
In other words:
Check the leftmost branch of the tree; it has as max_y values (2nd bullet in the quote), 7, 6, 4, 0.
But isn't that value the one that indicates the maximum y value stored in the subtree under the inner node? If so, this does not hold for 0 and point (-5, 6) (6 is not used, since it's used in the 2nd level).
*The particular query I am posing might not be damaged by that, but another one can.
Your last logic is actually still correct. The (-5,6) value should've already been picked up when you hit the node you labelled (6,-3).
Now, I'm no math major. But the general idea is this. In the Priority Search tree as you implemented, you're actually searching on two separate criteria.
For x, it's a simple binary search for the range.
For y, you're actually searching for it as a priority tree (good for searches of y:inf or -inf:y, depending on whether you use max or min).
Note that at the bottom of page 15, it states that the tree is good for a semi-infinite range (one is infinite). Keep reading down, you'll see how the tree is optimized for semi-infinite range for y.
In short, since your tree is constructed with x as the binary search and y as a priority (using max remaining value), the optimal search is [x1:x2],[y1:inf].
Each node in the tree essentially stores 2 things.
1. The mid-point of x (or the max-x in the left tree, and the decision to traverse left or right is based on the >x check).
2. The POINT with the highest y value in the subtree that had not been added to previous one.
The search algorithm essentially goes like this. Starting from the root using a criteria of [x1:x2], [y1:inf]
1. Check the y-value of the POINT assigned to the node. If y >= y1, go to 2, otherwise STOP traversing down (since the POINT assigned to the node has the highest y value in its subtree, if the check fails, there's no other node beneath it that can fulfill [y1:inf]).
2. Check if the x-value of the point is in the range [x1:x2]. If so, include that point in the output. Continue to step 3 regardless of whether you included that point.
3. Check the node's "mid-point" value (let's call that mx). If mx is in the range [x1:x2], traverse down both paths. If mx is below the range (mx < x1), traverse right. Otherwise (mx > x2) traverse left. Go back to step 1 for each path you traverse.
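The three steps can be sketched in Python as follows. This is an illustration under stated assumptions, not the structure from the slides: it uses a simplified variant in which every point is stored at exactly one node (the highest remaining y) and the remaining points are split at their median x, instead of keeping copies in the leaves. The names `PSTNode`, `build` and `query` are made up for this sketch.

```python
class PSTNode:
    def __init__(self, point, split, left, right):
        self.point = point    # point with the largest y in this subtree
        self.split = split    # x value separating the left and right subtrees
        self.left, self.right = left, right

def build(points):
    """Build the tree from a list of (x, y) points sorted by x."""
    if not points:
        return None
    top = max(range(len(points)), key=lambda i: points[i][1])
    rest = points[:top] + points[top + 1:]        # still sorted by x
    mid = len(rest) // 2
    split = rest[mid][0] if rest else points[top][0]
    # Left subtree holds x <= split, right subtree holds x >= split.
    return PSTNode(points[top], split, build(rest[:mid]), build(rest[mid:]))

def query(node, x1, x2, y1, out):
    """Collect all points with x1 <= x <= x2 and y >= y1 into `out`."""
    if node is None or node.point[1] < y1:
        return                                    # step 1: nothing below can match
    if x1 <= node.point[0] <= x2:
        out.append(node.point)                    # step 2: report this point
    if x1 <= node.split:
        query(node.left, x1, x2, y1, out)         # step 3: range reaches left side
    if x2 >= node.split:
        query(node.right, x1, x2, y1, out)        # step 3: range reaches right side

points = sorted([(-5, 6), (-4, 0), (-3, 4), (-2, 2), (0, 7),
                 (2, 1), (3, 6), (4, -4), (5, 3), (6, -1)])
tree = build(points)
found = []
query(tree, -2, 4, 1, found)
print(sorted(found))
```

Only the y check prunes a whole subtree; the x comparison against `split` merely decides which sides are worth visiting.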
EDIT for very, VERY long text:
Let's run through your example. I've made an additional annotation marking each point with a letter (the point's "name"). Each node now has the format: name of the point, with its y-value in parentheses, and the "mid-range" x below it. For the leaf nodes, those with an '*' are already assigned somewhere up the tree, and should be treated as NIL if you ever reach them.
                           E(7)
                     ------ 0 ---------------------
                     |                            |
                    A(6)                         G(6)
               ---- -3 ----                  ---- 4 ------
               |          |                  |           |
              C(4)       D(2)               F(1)        I(3)
         ---- -4 -       -2              --- 3 ---       5
         |       |       / \             |       |      / \
        B(0) C*(-3,4) D*(-2,2) E*(0,7)  NIL   H(4,-4) I*(5,3) J(6,-1)
        -5                               2
        / \                             / \
A*(-5,6) B*(-4,0)                 F*(2,1) G*(3,6)
Let's run an example query of [-2:4] [1:inf](or any y >= 1)
Starting from root, we see point E.
Is the y of point E (7) >= 1? Yes.
Is the x of point E (0) in [-2:4]? Yes
Add E to output.
Is the mid-point (also 0) in range? Yes
From the node with E, need to traverse both side.
First, let's go left, we see point A.
Is the y of point A(6) >= 1? Yes.
Is the x of point A(-5) in [-2:4]? No.
Skip A, but continue traverse (only the y check stops traversion).
Is the mid-point at A in range of [-2:4]? No, it's to the left.
Since -3 < [-2:4], we only need to traverse right. Now we see point D.
Is the y of point D (2) >= 1? Yes.
Is the x of point D (-2) in [-2:4]? Yes.
Add D to output, now we have (D,E).
Is the mid-point at D (-2) in range? Yes, so we traverse both sides. Both children are leaves marked with '*' (D* and E*), which are already assigned higher up the tree, so we treat them as NIL and stop there.
We traverse back up to A; since we don't need to traverse its left path, we keep going up.
Now I'm back at the root, which needs both sides traversed. Since we just went left, we go right. There, we see point G.
Is the y of point G (6) >= 1? Yes.
Is the x of point G (3) in [-2:4]? Yes.
Add G to output, now we have (D,E,G).
Is the mid-point at G in range? Yes, we'll have to traverse both sides.
Let's go left first. We see F.
Is the y of point F (1) >= 1? Yes.
Is the x of point F (2) in [-2:4]? Yes.
Add F to output, now we have (D,E,F,G).
Is the mid-point at F in [-2:4]? Yes, we'll have to traverse both sides.
Let's go left first again, we see... NIL. There are no more points to be added below (since we already checked/added F and G before).
Let's go back up to F and travel down the right side, we see H.
Is the y of point H (-4) >= 1? No. Don't add the point and stop traversing.
We go back up to F, which already has both paths traversed.
We go back up to G, which still needs its right path traversed.
We traverse down the right path, and see I.
Is the y of point I (3) >= 1? Yes.
Is the x of point I (5) in [-2:4]? No.
Skip I.
Is the mid-point at I in range of [-2:4]? No, it's greater.
Since it's greater, we only need to traverse the left side, so let's do that.
Traversing down the left side we see... I, again. Since we have already seen I, and it's a leaf node, we stop traversing and go back up to I.
I is done (don't need to traverse right). Go back up to G.
G is done (both sides traversed). Go back up to E.
E is done (both sides traversed). Since E is the root node, we're now done.
The result is that "D,E,F,G" are in range.

AVL tree balance

I have implemented an AVL tree, but I have a problem.
Suppose I have following tree:
And after adding another node:
Now I must rotate node5 to left:
But after rotation, it is still unbalanced.
Where am I making a mistake?
The presented scenario conforms to the Right-Left case from this Wikipedia description.
Your mistake is that you rotate the imbalanced node (5) at once, without first performing a rotation of its sub-tree.
In general having P as the unbalanced node, L as its left sub-tree and R as its right sub-tree the following rules should be followed at insertion:
balance(N) = Depth(Nleft) - Depth(Nright)
if (balance(P) > 1) // P is left-heavy
{
    if (balance(L) < 0)
    {
        rotate_left(L);
    }
    rotate_right(P);
}
else if (balance(P) < -1) // P is right-heavy; P is node 5 in this scenario
{
    if (balance(R) > 0) // R is node 11 in this scenario
    {
        rotate_right(R); // This case conforms to this scenario
    }
    rotate_left(P); // ... and of course this, after the above
}
So sometimes two rotations need to be performed, and sometimes only one.
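For reference, here is a small runnable Python sketch of the same rules, built into a recursive insert. The names `Node`, `rebalance` and `insert`, and the choice to store a `height` field in each node, are assumptions made for this illustration and are not taken from the question's implementation.

```python
class Node:
    def __init__(self, key):
        self.key, self.left, self.right, self.height = key, None, None, 1

def height(n):
    return n.height if n else 0

def update(n):
    n.height = 1 + max(height(n.left), height(n.right))

def balance(n):
    return height(n.left) - height(n.right)

def rotate_right(p):
    l = p.left
    p.left, l.right = l.right, p      # l's right subtree moves under p
    update(p); update(l)
    return l

def rotate_left(p):
    r = p.right
    p.right, r.left = r.left, p       # r's left subtree moves under p
    update(p); update(r)
    return r

def rebalance(p):
    update(p)
    if balance(p) > 1:                        # left-heavy
        if balance(p.left) < 0:               # Left-Right case
            p.left = rotate_left(p.left)
        return rotate_right(p)
    if balance(p) < -1:                       # right-heavy
        if balance(p.right) > 0:              # Right-Left case
            p.right = rotate_right(p.right)
        return rotate_left(p)
    return p

def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    return rebalance(node)

root = None
for k in (5, 11, 7):          # a small Right-Left case: 7 goes left of 11
    root = insert(root, k)
print(root.key, root.left.key, root.right.key)   # 7 5 11
```

Note that each rotation returns the new subtree root, and the caller must store it back into the parent link; forgetting this is a common reason a tree still looks unbalanced after a rotation.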
This is nicely visualized at Wikipedia:
The top row shows situations when two rotations are needed. The middle row presents possible scenarios when one rotation is sufficient. Additional rotations transform any top-row scenario to the middle-row scenario.
In particular, for this tree:
After 7 is added:
The balance of 5 is -2 (left depth minus right depth, as defined above). This conforms to the scenario marked with a comment above in the pseudo-code and also to the top-row scenario (the one on the right) in the Wikipedia picture. So before 5 is left-rotated, its right sub-tree 11 needs to be right-rotated:
And it becomes:
Only now, it's the simple case (middle-row right scenario in the Wikipedia picture) to restore balance at 5 by one left-rotation:
And the tree becomes balanced again:
Let me try to analyse this more comprehensively.
For a binary tree to be an AVL tree, the difference between the heights of the left and right subtrees of every node must lie within {-1, 0, 1}.
Insertion for AVL:
There are four cases for AVL insertion-
L - L
L - R
R - R
R - L
Now,
case (a) [if balance > 1] L-L (left-left): a node X violates the {-1, 0, 1} constraint and
has left height greater than right height, which gives the first L;
has a left child Y whose left height is greater than its right height, which gives the second L.
Action: rotate about Y clockwise, i.e. one right rotation.
case (b) (L-R case): When a node is inserted, it is first decided whether it is placed to the left or to the right of an existing node Z: right if it has more weight, left if it has less.
Say the new node Z' has wt(Z') > wt(Z), i.e. Z' is inserted as the right child of Z. This is the L-R case: the whole link Z-Z' is rotated anticlockwise, which turns it into the L-L case, solved as in case (a) above, i.e. one left and then one right rotation.
case (c) [if balance < -1] (R-R case): this mirrors the L-L case; simply apply the binary search rule for the adjustments, i.e. one left rotation.
case (d) (R-L case): it is first converted to the R-R case and then solved as in case (c) above, i.e. one right and then one left rotation.

negamax algorithm for a 3-in-a-row game on a 3x4 grid (rows x columns)

I'm struggling with the negamax algorithm for a 3-in-a-row game on a 3x4 (rows x columns) grid. It is played like the well-known 4-in-a-row, i.e. the pieces are dropped (NOT like TicTacToe). Let's call the players R and B. R has the first move; B's moves are controlled by negamax. Possible moves are 1, 2, 3, or 4. This is the position in question, which was reached after R: move 3 --> B: move 1 --> R: move 3:
  1   2   3   4
|   |   |   |   |
|   |   | R |   |
| B |   | R |   |
Now, in order to defend against move 3 by R, B has to play move 3 itself, but it refuses to do so. Instead it plays move 1 and the game is over after R's next move.
I spent the whole day looking for an error in my negamax implementation which works perfectly for a 3x3 grid, by the way, but I couldn't find any.
Then I started thinking: Another explanation for the behavior of the negamax-algorithm would be that B is lost in all variations no matter what, after R starts the game with move 3 on a 3x4-grid.
Can anybody refute this proposition or point me to a proof (which I would prefer ;-))?
Thanks, RSel
It is, in fact, a won game from the start. And can be played through fairly easily by hand. I will assume that B avoids all of the 1-move wins for R, and will mark moves by color, and spot in the grid where the play happens.
1. R3,1
... B1,1 2. R3,2 B3,3 3. R4,1 B2,1 4. R2,2 (and R1,2 or R4,2 wins next)
... B2,1 2. R3,2 B3,3 3. R2,2 B2,3 4. R1,1 (and R1,2 or R1,3 wins next)
... B3,2 2. R2,1 (and R1,1 or R4,1 wins next)
... B4,1 2. R2,1 B1,1 3. R3,2 B3,3 4. R2,2 (and R1,2 or R4,2 wins next)
As for your algorithm, I'm going to suggest that you modify it to prefer wins over losses, and prefer distant losses over near losses. If you do that, it will "try harder" to avoid the inevitable loss.
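One way to express "prefer distant losses over near losses" is to make the score depend on how early the game ends. The Python sketch below is an illustration, not the asker's implementation: the board representation (a tuple of column strings, bottom piece first), the function names and the scoring scheme (a win is worth the number of empty cells at the time of the winning move) are all assumptions.

```python
from functools import lru_cache

ROWS, COLS, NEED = 3, 4, 3
OTHER = {'R': 'B', 'B': 'R'}

def drop(cols, c, player):
    """Return the board after `player` drops a piece into column c."""
    return cols[:c] + (cols[c] + player,) + cols[c + 1:]

def wins(cols, c):
    """True if the top piece of column c completes a line of NEED pieces."""
    r = len(cols[c]) - 1
    who = cols[c][r]
    def mine(cc, rr):
        return 0 <= cc < COLS and 0 <= rr < len(cols[cc]) and cols[cc][rr] == who
    for dc, dr in ((1, 0), (0, 1), (1, 1), (1, -1)):
        run = 1
        for sign in (1, -1):                  # walk both ways along the line
            cc, rr = c + sign * dc, r + sign * dr
            while mine(cc, rr):
                run += 1
                cc, rr = cc + sign * dc, rr + sign * dr
        if run >= NEED:
            return True
    return False

def move_score(cols, c, player):
    """Score of playing column c, from the point of view of `player`."""
    empty = ROWS * COLS - sum(len(col) for col in cols)
    nxt = drop(cols, c, player)
    if wins(nxt, c):
        return empty                          # earlier wins score higher
    return -negamax(nxt, OTHER[player])

@lru_cache(maxsize=None)
def negamax(cols, player):
    """Best achievable score for `player` to move; 0 means a draw."""
    legal = [c for c in range(COLS) if len(cols[c]) < ROWS]
    if not legal:
        return 0
    return max(move_score(cols, c, player) for c in legal)

def best_move(cols, player):
    legal = [c for c in range(COLS) if len(cols[c]) < ROWS]
    return max(legal, key=lambda c: move_score(cols, c, player))

# Position from the question: R in column 3 twice, B in column 1, B to move.
position = ('B', '', 'RR', '')
print(best_move(position, 'B') + 1)           # 3: the move that delays the loss
```

With this scoring a loss in one move is worse than a loss in several moves, so B blocks column 3 even though the position is lost either way.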
Proof that B3 loses as well:
B3: R(1,2,4)->R1; B(1,2,4)->B2 (loses), so B1; R(2,4)->R2 Loses, so R4; B(2,4)->B2 loses, so B4; R loses on either choice now
...so R1 will lose for R - so R won't choose it.
B3: R(1,2,4)->R2 loses since B2, so R won't choose it
B3: R(1,2,4)->R4; B2 (forced); R2 (forced); B loses on R's next move
...so, B3 loses for B as well as B1...so B has lost in this situation.
EDIT: Just in case anyone is wondering about the other B options (2,4) at the end of "B3: R(1,2,4)->R1; B(1,2,4)->B2 (loses), so B1"...they are irrelevant, since as soon as Red chooses R1, this scenario shows that B (computer) can choose B1 and win. It doesn't really matter what happens with B's other choices - this choice will win, so Red can't choose R1 or he will lose.

Resources